I. INTRODUCTION


The long-term goal of this work is to develop a class of hybrid 
integrated-optical processors which will be capable of high-speed matrix 
computations. It is envisioned that the ultimate system will consist of an 
array of many integrated optical circuits (IOCs) of several different types 
which are interconnected in a programmable fashion to allow a variety of 
computational tasks to be carried out. 

The potential advantages of the hybrid integrated-optical processor 
are high computation speed, low power consumption and mechanical integrity, 
all of which are advantageous for the aerospace environment. Two of the key 
technical problems are the architectural strategies for computational IOCs 
and their interconnection, and the integrated optical lenses which are re- 
quired for compact IOC packaging. These topics are addressed in separate 
chapters of this report. 

The chapter on Integrated Optic Circuits for Matrix Computation 
stresses planar, as opposed to channelized, integrated optical circuits 
(IOCs) as the basis for computational devices. Both fully-parallel and sys- 
tolic architectures are considered and the tradeoffs between the two device 
types are discussed. It is then pointed out that the Kalman filter approach 
is a most important computational method for many NASA problems. This ap- 
proach to deriving a best-fit estimate for the state vector describing a 
large system will lead to matrix sizes which are beyond the predicted capa- 
cities of planar IOCs. It is shown that this problem can be overcome by 
matrix partitioning, and several architectures for accomplishing this are 
described. 


The Luneburg lens work has involved development of lens design 
techniques, design of mask arrangements for producing lenses of desired shape, 
investigation of optical and chemical properties of arsenic trisulfide films,
deposition of lenses both by thermal evaporation and by rf sputtering, opti- 
cal testing of these lenses, modification of lens properties through ultra- 
violet irradiation, and comparison of measured lens properties with those 
expected from ray-trace analyses. Lenses with apertures up to 1 cm and 
design speeds down to f/2 at this aperture were tried. The better evaporated 
lenses had focal spot sizes, at reduced aperture, no more than twice the limit 
set by diffraction effects. Initial sputtered lenses promised to be of com- 
parable quality; lenses made after the sputtering target had been in operation 
for some time, though, tended to absorb light excessively at the design wave- 
length, 633 nm. This effect appears to be related to a change in the composi- 
tion of the films. When a thoroughly reliable deposition and treatment proc- 
ess for chalcogenide lens materials is developed, straightforward design and 
testing improvements should permit fabrication of Luneburg lenses suitable 
for many beam-forming and signal-processing requirements.

Although this report has two relatively independent major sections, 
the figure, equation, and reference numbers are consecutive. The subsections 
and pages are also numbered consecutively throughout; all the appendices are 
placed at the end. 

II. INTEGRATED OPTIC CIRCUITS FOR MATRIX COMPUTATION


1. INTRODUCTION 

The goal of this effort was to evaluate integrated optic architec- 
tures required to perform matrix algebra functions such as addition, subtrac- 
tion, multiplication and inversion and to combine these functions to obtain 
solutions to matrix algebra equations. It was desired that particular atten- 
tion be paid to optical implementation of systolic array architectures during 
these evaluations. 

In carrying out the program the operations of matrix-vector and 
matrix-matrix multiplication were emphasized over all others because of their 
importance in a large number of application areas and because these are opera- 
tions which consume a large amount of time and hardware when performed elec- 
tronically. The systolic architectures were stressed, but some attention was 
paid to looking at the implications of fully parallel methods, especially for 
the matrix-vector multiplication operation. 

From a more systems-oriented viewpoint, this study also touched upon 
some NASA applications for high-speed matrix processors, identified the Kalman 
filter as having a large number of important applications, and showed that by 
using standard matrix decomposition techniques, it is possible to use arrays 
of optical processors of limited size to carry out very large computations. 

In this section we deal both with the hardware and systems aspects 
of optical matrix multiplication. The hardware discussion begins with a des- 
cription of the basic integrated optic components, then progresses to inte- 
grated optic architectures for matrix multiplication, and ends with methods 
for assembling a number of basic multipliers to perform operations on large 
matrices. The remainder of the section is devoted to a discussion of some 
applications of the matrix-multiplication operation which should be of 
interest to NASA. 

Many of the basic functions which are required to construct inte- 
grated-optic computational devices can be implemented either in a planar 
or a channel waveguide geometry. In the work which has been under way at
Battelle for the past few years, we have selected the planar geometry for a
number of reasons. Some of these are: ease of fabrication, geometric
versatility, and the elimination of some of the complicating interference
effects which arise when single-mode channel waveguides are merged.

Much of the relevant work performed in our laboratories has in- 
volved the use of an interdigital electrode pattern, such as shown in Figure 1, 
which when deposited on a buffer layer on the surface of an electrooptic wave- 
guide can be used to modulate the intensity of, or to change the direction 
of, a planar guided wave. The buffer layer serves to isolate the electrodes 
from the waveguide so that the guided wave is affected only by the electri- 
cally induced periodic index-of-refraction variation and not by the presence 
of the metallization pattern. 

The tangential component of the electric field in the waveguide is
the only field effective in altering the refractive index for the usual
arrangement: TE-mode light propagating in the x direction in a Y-cut crystal
of LiNbO3. An expression describing this field has been derived by Engan.(1)
The fundamental component is given by

    E_z = (0.847)\,\frac{V}{g}\,\cos(\ldots) ,                        (1)

where g is the electrode gap width, and z is the distance from the gap center.
In the Bragg regime only this component is effective. In the electrooptic
waveguide this field results in an index-of-refraction modulation

    \Delta n = -\tfrac{1}{2}\, n_{eff}^{3}\, r\, E .                  (2)

The index of refraction n_eff is the effective index of the guided mode, and
r is the appropriate electrooptic coefficient. Since the electric field and
the index modulation fall off exponentially with depth below the surface, it
is desirable to use a waveguide which confines the light closely to the
waveguide surface. On a LiNbO3 substrate, a Ti-indiffused guide is therefore
preferable to an out-diffused guide.

If we ignore the fall-off of the field in the y direction, we can
treat the periodic index variation as a simple thick Bragg grating, the
Bragg angle θ_B being given by

    \sin\theta_B = \frac{\lambda_0}{2\, n_{eff}\, \Lambda}            (3)

and the diffraction efficiency by(2)

    \eta = \sin^{2}\!\left( \frac{\pi\, \Delta n\, d}{\lambda_0 \cos\theta_B} \right) ,   (4)

where Λ is the wavelength of the electrooptic grating, and λ_0 is the vacuum
wavelength of the light.

In carrying out this work we have used electrooptic gratings with
wavelength Λ of both 13.33 µm and 8.41 µm. These have Bragg angles of 0.62°
and 0.98°, respectively, for He-Ne laser light in the LiNbO3 waveguides. The
measured diffraction efficiency for one of these gratings is shown in Figure 2.
As can be seen, the maximum efficiency is about 95% and the behavior of the
diffraction efficiency as a function of the applied voltage is a good fit to
the behavior predicted by Eq. (4). These electrode structures are easily
fabricated by standard photolithographic techniques and have a low
capacitance, allowing high-speed operation.
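
As a quick numerical check of Eqs. (3) and (4), the following sketch
evaluates the Bragg angle for the two grating wavelengths quoted above. The
effective index of 2.2 and the He-Ne wavelength of 632.8 nm are assumed
values rather than measured ones, and the function names are purely
illustrative.

```python
import math

WAVELENGTH_UM = 0.6328  # He-Ne vacuum wavelength in micrometers (assumed)
N_EFF = 2.2             # effective index of the guided mode (assumed)


def bragg_angle_deg(period_um, wavelength_um=WAVELENGTH_UM, n_eff=N_EFF):
    """Bragg angle inside the waveguide from Eq. (3), in degrees."""
    return math.degrees(math.asin(wavelength_um / (2.0 * n_eff * period_um)))


def diffraction_efficiency(delta_n, length_um, period_um,
                           wavelength_um=WAVELENGTH_UM, n_eff=N_EFF):
    """Thick-grating diffraction efficiency from Eq. (4)."""
    theta = math.radians(bragg_angle_deg(period_um, wavelength_um, n_eff))
    phase = math.pi * delta_n * length_um / (wavelength_um * math.cos(theta))
    return math.sin(phase) ** 2


for period in (13.33, 8.41):
    print(f"grating wavelength {period} um -> "
          f"Bragg angle {bragg_angle_deg(period):.2f} deg")
```

With these assumed values the computed angles agree with the quoted 0.62°
and 0.98° to two decimal places.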


2. ELEMENTARY ARITHMETIC OPERATIONS USING PLANAR IOCs 


We describe here some modifications of the basic interdigital elec- 
trode structure which allow a number of elementary computational functions 
to be performed. It should be noted that all of the computational schemes 
which we discuss are intrinsically analog in nature and can therefore be 
expected to have an accuracy of about 1% (6 to 7 bits), as compared to 16 bits 
or more for digital systems. This apparent disadvantage must be viewed in 
light of the very high-speed operation, low power dissipation and ease of 
fabrication which is expected to characterize the IO devices. In addition
there have been recent suggestions for architectures which have the potential 
for increasing the accuracy of optical devices to the 16-bit range and for 
incorporating floating-point operation. We have not yet attempted to work 
out all of the details involved in incorporating these improvements into a 
single IOC, but it is evident that there will be a significant increase in 
hardware complexity, not an unexpected tradeoff. 

Figure 2. Diffraction efficiency versus voltage for electrooptic grating with
Λ = 13.3 µm, d = 2.8 mm deposited on a LiNbO3:Ti waveguide. Solid
line shows the calculated diffraction efficiency and the points are
observed data.



Subtraction and Vector Subtraction 


We can combine Equations 2 and 4 and rewrite them as

    \eta = \sin^{2}[\alpha (A - B)]                                   (5)

where α contains all of the geometric and material parameters and A and B
are the voltages applied to the left and right electrodes, respectively.
The intensity of the diffracted beam is now seen to depend only on the
difference of the two voltages. If the electrode structure is extended as
shown in Figure 3 and a lens is added to collect the contributions of the
individual segments of the structure then the optical energy at the detector
is given by

    E = \sum_{i=1}^{N} \frac{E_0}{N} \sin^{2}[\alpha (A_i - B_i)]
      \approx \frac{E_0}{N}\, \alpha^{2} \sum_{i=1}^{N} (A_i - B_i)^{2} .   (6)

It is evident that if A_i and B_i are the components of the N-dimensional
vectors \vec{A} and \vec{B}, respectively, the structure shown in Figure 3
produces a quantity proportional to the squared vector difference
(\vec{A} - \vec{B})^2. Of course, this is true only when all A_i and B_i
satisfy the condition

    \alpha (A_i - B_i) \approx \sin[\alpha (A_i - B_i)] .             (7)
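
The small-signal condition of Eq. (7) can be illustrated numerically. The
sketch below compares the exact sum of sine-squared terms with the quadratic
approximation of Eq. (6). The voltages, the value of alpha, and the
assumption that each of the N segments receives an equal share e0/N of the
incident energy are hypothetical choices made only for illustration.

```python
import math


def detector_energy(a_volts, b_volts, alpha, e0=1.0):
    """Exact sum over segments: (e0/N) * sum of sin^2[alpha * (A_i - B_i)]."""
    n = len(a_volts)
    return sum(e0 / n * math.sin(alpha * (a - b)) ** 2
               for a, b in zip(a_volts, b_volts))


def small_signal_energy(a_volts, b_volts, alpha, e0=1.0):
    """Small-signal form: (e0/N) * alpha^2 * sum of (A_i - B_i)^2."""
    n = len(a_volts)
    return e0 / n * alpha ** 2 * sum((a - b) ** 2
                                     for a, b in zip(a_volts, b_volts))


A = [1.0, 2.0, 3.0]
B = [0.5, 2.5, 2.0]
ALPHA = 0.05  # hypothetical constant, chosen so that alpha * (A_i - B_i) << 1

exact = detector_energy(A, B, ALPHA)
approx = small_signal_energy(A, B, ALPHA)
print("exact:", exact)
print("small-signal:", approx)
print("relative error:", abs(exact - approx) / approx)
```

Increasing ALPHA in this sketch makes the relative error grow, which is the
nonlinearity that restricts the structure to small voltage differences.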

Multiplication and Vector Multiplication 

In Figure 4 are shown two electrooptic grating electrodes arranged 
in a herringbone pattern with a grounded spine. The angles are such that 
light diffracted by the first grating is incident upon the second grating at 
its Bragg angle. Twice-diffracted light therefore has an intensity which is 
proportional to the product of the diffraction efficiencies of the two
gratings. In general, this intensity is proportional to the product of two sine
functions, a quantity which is proportional to the product AB of the two
voltages only in the small signal approximation. Several methods for
overcoming this nonlinearity are discussed in Appendix A. In the remainder of
this section we shall proceed as if the linearization problem were handled by
one or another of these methods.

Figure 3. An extended electrooptic structure for performing vector
subtraction. Voltages A_1, A_2, A_3 and B_1, B_2, B_3 are applied to
the electrode segments; the detected intensity is
I = \sum_i I_i = \sum_i \alpha^2 (A_i - B_i)^2.

Figure 4. A herringbone electrooptic structure for performing
multiplication. The geometry is such that light diffracted by the
first grating enters the second grating at its Bragg angle so
that it can subsequently be diffracted by the second grating.
The output light intensity is therefore proportional to the
product of the two diffraction efficiencies.


The extension of the herringbone structure as shown in Figure 5 
allows the generation of an optical signal whose power is proportional to 
the scalar product A·B. It is this herringbone structure or variations of
it which form the basis for all of the matrix-multiplication devices we 
discuss. 


3. MATRIX-VECTOR MULTIPLICATION 

We describe here two approaches to matrix-vector multiplication, 
both of which make use of the herringbone electrode arrangement previously 
described. The first is an adaptation of electronic systolic array archi- 
tecture and the second is a fully parallel method. The comparison between 
the two approaches can provide the basis for some interesting tradeoff studies 
when all of the device parameters are available. 

The problem to be addressed is illustrated for a 3 x 3 matrix in
Eq. 8, where the vector components are x_i, i = 1,2,3, and the matrix elements
are a_ij.

    a_{11} x_1 + a_{12} x_2 + a_{13} x_3 = y_1
    a_{21} x_1 + a_{22} x_2 + a_{23} x_3 = y_2
    a_{31} x_1 + a_{32} x_2 + a_{33} x_3 = y_3                        (8)

The expansion of the multiplication

    A \vec{x} = \vec{y} ,                                             (9)

is written out in detail to emphasize the facts that each component of x 
is used three (N) times during the calculation, and that the calculation 
itself is composed simply of the sum of products. Both addition and multi- 
plication can be carried out quite naturally in an IOC, or, for that matter, 
in a bulk optical arrangement. The basic problem is to design an
architecture which, most simply or efficiently, gets each of the x_i and a_ij
to the proper position at the proper time. Both systolic and fully parallel
methods of accomplishing this are discussed.

Figure 5. The herringbone structure extended to allow vector multiplication.




Systolic Array Architecture 


The approach to computer design known as systolic array architecture
was developed by Kung(3) and others as a method of approaching the problem
of VLSI computer design. The basic guidelines are:

a. Each datum should be fetched from memory only 
once to avoid the "von Neumann bottleneck". 

b. Each chip should contain only a small number of 
different processor subunits, although these sub- 
units may be repeated many times on each chip. 

c. Connections between subunits should be only to 
nearest neighbors to facilitate the rapid flow 
of data and to simplify fabrication. 

The main disadvantage associated with the use of a systolic archi- 
tecture in an optical processor is that the progression of data in discrete 
steps requires electronic timing circuitry which can place a severe constraint 
on the ultimate speed of the system. Aside from this problem, we would be 
hard pressed to compile a better set of design guidelines for integrated opti- 
cal circuits than those listed above. The first guideline is certainly desir- 
able since we do not yet have available an optically addressable memory for 
IOCs, although some recent work(4) on surface holograms may be adaptable for
this purpose. It is therefore essential that the recourse to memory be mini- 
mized since the act of fetching data from a digital store is much slower than 
the rate at which the IOC is capable of using that data. Second, at this stage 
in the development of IOC technology, we have only a small number of opera- 
tional building blocks available to us. The second guideline is therefore 
compatible with IOC technology, if only by default. The third guideline is, 
perhaps, not as important for optical as for electronic systems since it is 
possible to have optical carriers intersect either in planar or in channel(5)
configurations without causing significant crosstalk. Complex interconnec- 
tion schemes can therefore be implemented without requiring a multilayer 
structure. However, since the progress of the data through an optical pro- 
cessor is controlled by the speed of light in the device and not by a digital 
clock, it will be necessary to pay attention to path lengths in high-speed 
devices to assure that proper synchronism of the data flow is maintained. 

The first optical matrix-vector multiplier based upon a systolic-type
architecture was suggested by Caulfield et al.(6) (Figure 6). This is
an example of an optical implementation of Kung's systolic architecture as
modified by Tamura(7) for optical implementation. The object of this so-called
"engagement" architecture is to arrange for the data to flow through a series of
cells which accept pairs of inputs and accumulate the sums of the products of
the pairs. This data flow is illustrated schematically in Figure 7. In the
implementation shown in Figure 6 the light sources are modulated in proportion
to the matrix elements and the vector components are carried through the
engagement region by a properly modulated acoustic wave. Since the data flow is
essentially one-dimensional, this architecture can be implemented in either
bulk- or integrated-optic form.

Figure 6. A suggested optical implementation of a systolic matrix vector multiplier (reference 6).
Matrix values are introduced as LED intensities, vector values are introduced as
surface acoustic wave amplitudes, and the products are summed in the CCD analog
shift register which also tracks the position of the output.
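
The stepped data flow described above can be sketched in software. The
following is a minimal simulation of one possible schedule for a linear
array of N accumulating cells; it is an illustrative assumption, not
necessarily the exact timing of Figure 7, and the function name is
hypothetical. Each vector component advances one cell per beat and meets the
matching matrix element in each cell.

```python
def engagement_matvec(a, x):
    """Systolic matrix-vector product y = A x on a linear array of N cells.

    Cell i holds the running sum for y[i]. Vector component x[j] enters
    cell 0 at beat j and advances one cell per beat, so it reaches cell i
    at beat i + j, when matrix element a[i][j] is presented to that cell.
    """
    n = len(x)
    y = [0.0] * n
    pipeline = [None] * n  # x value currently sitting in each cell
    for beat in range(2 * n - 1):
        # Shift the vector stream one cell along and feed in the next x.
        pipeline = [x[beat] if beat < n else None] + pipeline[:-1]
        for i, value in enumerate(pipeline):
            if value is not None:
                j = beat - i  # index of the x component now in cell i
                y[i] += a[i][j] * value
    return y


A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
X = [1, 2, 3]
print(engagement_matvec(A, X))
```

In this schedule the full product is available after 2N - 1 beats, and each
x component is fetched only once, in keeping with the first systolic
guideline.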

In an alternative scheme(8) which was devised solely for IOC
implementation, the engagement region consists of an extended herringbone electrode
structure. The entire IOC, which is currently under construction, is shown 
schematically in Figure 8. It consists of the herringbone structure, shown 
in the figure as two integrated optical spatial light modulators (IOSLMs) tilted 
at an appropriate angle, collimating and imaging lenses along with a beam stop 
to prevent the undiffracted and singly diffracted light from reaching the 
detector array, and a suitable butt-coupled laser diode light source. The 
following figure (Figure 9) shows a schematic of the electronics required to 
exercise the device. It is assumed that both the matrix and vector values 
are stored in a digital memory. It is seen that a formidable array of shift
registers and D/A converters is required to introduce the electrical data.

There is an obvious tradeoff between the acoustic and electrooptic
approaches. In the former, the vector components proceed naturally through
the engagement region, carried by the acoustic wave. However, the data rate
is limited by the acoustic velocity and must, in fact, be synchronized with
the rate determined by the acoustic velocity and the cell size. If we assume
each datum is represented by at least a 100 µm-long SAW, and that the device
is built in LiNbO3 (LNO), then the maximum data rate is 35 MBit/sec. In the
electrooptic multiplier, the data rate is determined by an external electronic
clock or shift register (which is also required to modulate the light sources
in the acoustic device). Since the electrode capacitance is less than 20
pF/element, a data rate of 500 MBit/sec should be possible, assuming a 50 ohm
source impedance. The tradeoff is that to drive the electrooptic device
additional external electronics are required. However, the speed advantage
over SAW or pure electronic devices may make this a very favorable trade.
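
The two data-rate figures can be checked with simple arithmetic. In the
sketch below the surface acoustic wave velocity of 3500 m/sec is an assumed
value for LiNbO3 (it is not stated in the text); the cell length, electrode
capacitance and source impedance are the values quoted above. The function
names are illustrative only.

```python
SAW_VELOCITY_M_PER_S = 3500.0     # assumed SAW velocity in LiNbO3
CELL_LENGTH_M = 100e-6            # SAW length per datum (from the text)
ELECTRODE_CAPACITANCE_F = 20e-12  # upper bound per element (from the text)
SOURCE_IMPEDANCE_OHM = 50.0       # from the text


def acoustic_data_rate(velocity=SAW_VELOCITY_M_PER_S,
                       cell_length=CELL_LENGTH_M):
    """Data per second passing a fixed point on the acoustic wave."""
    return velocity / cell_length


def rc_time_constant(resistance=SOURCE_IMPEDANCE_OHM,
                     capacitance=ELECTRODE_CAPACITANCE_F):
    """Charging time constant of one electrode element, in seconds."""
    return resistance * capacitance


print("acoustic data rate (MBit/sec):", acoustic_data_rate() / 1e6)
print("electrode RC time constant (nsec):", rc_time_constant() * 1e9)
```

Under these assumptions the acoustic rate comes out at 35 MBit/sec, and the
RC time constant is about 1 nsec, so a 500 MBit/sec rate corresponds to a
bit period of two time constants.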

Figure 7. The engagement architecture for vector-matrix multiplication indicating
the data flow and the product accumulation.

Figure 8. A suggested integrated optical circuit for performing vector matrix
multiplication using the engagement architecture. The diode laser
continually illuminates the interaction region which consists of two
integrated optical spatial light modulators in the herringbone
configuration. The lens images the interaction region on the detector
array after the undiffracted and singly diffracted light has been
removed from the beam.

Figure 9. Digital drive circuitry for exercising the engagement processor. The
parallel-in serial-out (PISO) shift registers receive the stored data
from a digital memory and clock it through a D/A converter into the
proper electrodes of the IOSLMs.

Fully Parallel Architecture


One of the traditional advantages of an optical approach to signal 
or data processing is the potential for utilizing a fully parallel architec- 
ture. In the case of matrix-vector multiplication, this means that all data 
are entered simultaneously and all multiplications and sums are carried out 
as the data are entered. The first suggestion for a fully parallel optical 
approach to matrix-vector multiplication was made by Goodman (9) in 1978. 

The Goodman approach provides an excellent basis for discussing
some of the advantages and the problems associated with optical numerical
processors. The earliest version of the Goodman matrix-vector multiplier is
illustrated in Figure 10. Vector components are introduced as LED
intensities x_1, x_2, ..., x_N and the matrix components by a mask. The x_i are
distributed over the appropriate a_ij mask locations by an anamorphic lens
arrangement. The products are directed to the appropriate summing detectors
by an orthogonal lens arrangement. The advantage of this configuration and,
indeed, the basic rationale for the optical approach is its speed; answers
appear as fast as the x_i are varied. The system latency is simply the time
taken for light to traverse it, about 0.3 nsec for a 10 cm device. Goodman
et al.(10) discuss several variations of this device. In one the anamorphic
lenses are replaced by multimode slab waveguides and in another by fiber
bundles.

All of these devices have the property of performing the matrix-vector
multiplication in a fully parallel manner. They also have three obvious
disadvantages. First, the device is not programmable and can therefore
perform only one function. This "hard-wired" characteristic is
common to all the numerical optical processors we will discuss. Second,
they can handle only real, non-negative quantities, a point which can be
addressed below. Third, there is no high-speed method for changing the
matrix mask. This last disadvantage has been overcome in several devices
suggested by other authors, but as could be expected, at the expense of
additional complexity and, in the case of serial input devices, at the expense
of a great reduction in speed.

The Goodman architecture handles a two-dimensional data array
(the mask) by means of a three-dimensional geometry, and therefore cannot
be directly implemented in a planar integrated-optic format. It is,
however, possible to design an IOC which can perform the fully parallel
operation. A schematic depicting such a device is shown in Figure 11. Once
again, both matrix elements and vector components are introduced as voltages
on electrooptic modulator segments. However, all voltages may be applied in
parallel, the vector components being imposed upon a guided plane wave by an
N-unit electrooptic IOSLM. These values are then distributed by fixed
surface gratings so that they impinge, in parallel, upon the matrix-element
modulators. As in the Goodman device, summation is performed with lenses and
the device falls into the space-integrating category. Assuming a 3-cm path in
LNO, the intrinsic processing time for such a device is 0.2 nsec. Of course,
S/N and dynamic range requirements will certainly demand a larger integration
time, but 10 nsec per multiplication should be realizable.

Figure 10. Schematic of the fully parallel method of performing matrix-vector
multiplication. Vector components are introduced as optical
intensities which are fanned-out horizontally across the rows of the
matrix mask. Light from each column is then summed and focused on
the detectors whose output is proportional to the components of the
product vector.
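
The two transit times quoted for the fully parallel devices (about 0.3 nsec
for a 10 cm bulk device and 0.2 nsec for a 3-cm path in LNO) follow from the
optical path length divided by the speed of light. The sketch below
reproduces them, assuming free-space propagation in the bulk device and a
refractive index of 2.2 for LNO; both are assumptions made for illustration.

```python
SPEED_OF_LIGHT_M_PER_S = 2.998e8


def transit_time_ns(length_m, refractive_index=1.0):
    """Time for light to traverse a path of the given length, in nsec."""
    return length_m * refractive_index / SPEED_OF_LIGHT_M_PER_S * 1e9


# 10 cm bulk-optic device, treated as free space (assumption)
print("bulk device:", round(transit_time_ns(0.10), 2), "nsec")
# 3 cm path in LNO, with an assumed index of 2.2
print("LNO IOC:", round(transit_time_ns(0.03, 2.2), 2), "nsec")
```

These are latencies only; as noted above, the usable rate is set by the
integration time needed to meet S/N and dynamic range requirements.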

One of the most obvious of the tradeoffs between the engagement
and fully parallel approaches is speed vs. hardware complexity. The engagement
processor requires a modulator N units wide. The direct processor requires a
modulator N^2 units wide. The largest IOSLM constructed to date has thirty-two
100 µm-wide units. A modulator 100 units wide is certainly possible, so a
single engagement processor could handle a 100 x 100 matrix, and a direct
processor a 10 x 10 matrix. The tradeoffs between the two approaches are
summarized in Table I.

Figure 11. IOC for direct vector matrix multiplication. The IOSLM in the lower
left is illuminated by a uniform guided wave. This guided wave is
modulated in proportion to the vector components as shown. This
information is then distributed via beam splitters to the modulators
which carry the matrix information. As in Figure 10, the summation
is performed optically and the resultant light is imaged on the
appropriate photodetector.

TABLE I. COMPARISON OF DIRECT AND ENGAGEMENT ARCHITECTURES

                          Engagement                   Direct

Data Flow                 Stepped                      Continuous

Electronic Interface      Parallel set of              Fully parallel
                          sequential inputs

Natural Device Geometry   Planar                       3-D

Speed                     Limited by electronic        Limited by detector
                          clock and/or shift           SNR
                          register

Electronic Interface      Complex: N+1 shift           Moderate: Only time-
                          registers, 2N D/A            dependent values must
                          converters. All data         change
                          moves at high speed.

IOC Size                  Maximum IOSLM size: N        Maximum IOSLM size: N^2

4. MATRIX-MATRIX MULTIPLICATION


Matrix-vector multiplication can be described as an N^2 problem, since
N^2 multiplications are required to produce the components of the product vector.
In the process of obtaining the desired result, each vector component is used
N times, and each matrix element is used once. The matrix-matrix
multiplication problem is, on the other hand, an N^3 problem, the components
of both matrices all being used N times in the computation. More specifically,
the problem is to compute

    C = A \cdot B                                                     (10)

where the ik element of C is given by

    c_{ik} = \sum_{j=1}^{N} a_{ij} b_{jk} .                           (11)

For 3 x 3 matrices, for example, c_13 is

    c_{13} = a_{11} b_{13} + a_{12} b_{23} + a_{13} b_{33} .          (12)

The systolic array architecture for carrying out this computation
is shown in Figure 12. The data are stepped through the engagement region
in the sequence shown. Each of the boxes, c_ij, computes the product of each
pair of simultaneously incident quantities and accumulates a running sum of
the products.
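
The accumulation performed by the array of boxes can be sketched as follows.
This is a schedule-level simulation under an assumed skewed timing, in which
the operand pair a_ij, b_jk reaches cell (i, k) at beat i + j + k; the
actual sequence is the one shown in Figure 12, and the function name is
hypothetical.

```python
def systolic_matmul(a, b):
    """Accumulate C = A B cell by cell under an assumed skewed schedule.

    Cell (i, k) keeps the running sum for c[i][k]. At each beat it
    receives at most one operand pair, a[i][j] and b[j][k], where
    j = beat - i - k, and adds their product to its running sum.
    """
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for beat in range(3 * n - 2):  # the last pair arrives at beat 3(n - 1)
        for i in range(n):
            for k in range(n):
                j = beat - i - k
                if 0 <= j < n:
                    c[i][k] += a[i][j] * b[j][k]
    return c


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(systolic_matmul(A, B))
```

Each cell performs N multiply-accumulate steps, so the N^2 cells together
carry out the N^3 multiplications noted above.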

Because of the higher dimensionality of this problem, we have not 
been able to devise a reasonable design for a fully parallel matrix-matrix 
multiplier, although such designs are possible in the world of three-dimensional 
optics. We have, however, arrived at two IOC matrix-matrix multiplier designs 
which are based upon the engagement algorithm. The first of these is shown in 
Figure 13. The computational units are composed of a herringbone electrode 
structure which performs the multiplication, and a detector with sufficiently 
long time constant to perform the sums. 

The intensity of the light diffracted by the herringbone structure is
proportional to the product of the two analog voltages applied to the
structure. These voltages must be stepped through the device in synchronism as
suggested in the figure. This may be accomplished using an analog shift
register or with digital shift registers and, for the multiplication of two
N x N matrices, 2N^2 D/A converters. At this time, the major problems with
the IOC of Figure 13 are the massive amount of high speed electronics required
and the fact that, for the configuration shown, a single IOC with N^2
elementary computational units must be employed. The second of these problems
is overcome in the device shown in Figure 14.

Figure 12. Systolic array architecture for matrix-matrix
multiplication showing the flow of data through
the computational elements. Each element performs
the sum-of-products operation.

The modified matrix multiplication IOC sketched in Figure 14 com- 
bines some of the features of Figure 13 with some of the features of the 
matrix-vector multiplier shown in Figure 11. As in the latter device, the 
herringbone structure has been split into two segments and beam splitters are 
used to distribute the information encoded in the light beam. The modified 
matrix multiplication IOC has the following advantages over the device sug- 
gested in Figure 13. 

• Because the b_rs values are distributed optically rather
than electronically, the data can, for most cases, be
considered to be applied simultaneously to the appropriate
a_mn. This allows the matrix element array to advance in a
rectangular rather than a skewed array. The result is to
reduce the processing time by N-1 beats.

• The geometry of Figure 14 suggests that a natural split
occurs after each row of A. Therefore, by using parallel
b_ij inputs to a number of IOCs, each IOC could calculate
one of the row vectors of C, and these calculations could
be done simultaneously.

Figure 14. A modified IOC for matrix-matrix multiplication by
the engagement algorithm. In this device almost
half of the modulator units are replaced by the
grating beam splitters which function to distribute
the b_ij information in a manner similar to that
shown in Figure 11.

Processor 1:
    c_{11}^{1} = a_{11} b_{11} + a_{12} b_{21}
    c_{12}^{1} = a_{11} b_{12} + a_{12} b_{22}
    c_{21}^{1} = a_{21} b_{11} + a_{22} b_{21}
    c_{22}^{1} = a_{21} b_{12} + a_{22} b_{22}

Processor 2:
    c_{11}^{2} = a_{13} b_{31} + a_{14} b_{41}
    c_{12}^{2} = a_{13} b_{32} + a_{14} b_{42}
    c_{21}^{2} = a_{23} b_{31} + a_{24} b_{41}
    c_{22}^{2} = a_{23} b_{32} + a_{24} b_{42}

Processor 3:
    c_{11}^{3} = a_{15} b_{51} + a_{16} b_{61}
    c_{12}^{3} = a_{15} b_{52} + a_{16} b_{62}
    c_{21}^{3} = a_{25} b_{51} + a_{26} b_{61}
    c_{22}^{3} = a_{25} b_{52} + a_{26} b_{62}

Each processor accumulates its products over time epochs 1 to 3.

Figure 15. Using three 2x2 processors to compute the products
required for four of the terms in a 6x6 product.
To complete the calculation the sums
c_{ij} = \sum_{k=1}^{3} c_{ij}^{k} must be computed.
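
The partitioning of Figure 15 can be written out as a short sketch. Each
small processor is represented here by an ordinary 2x2 product, and the
partial results are then summed; the helper names and the test matrices are
hypothetical and chosen only for illustration.

```python
def matmul(a, b):
    """Plain product of two small square blocks (stands in for one processor)."""
    n = len(a)
    return [[sum(a[i][j] * b[j][k] for j in range(n)) for k in range(n)]
            for i in range(n)]


def block(m, row, col, size):
    """Extract the size x size block whose top-left corner is (row, col)."""
    return [r[col:col + size] for r in m[row:row + size]]


def partitioned_block(a, b, block_row, block_col, size):
    """One size x size block of C = A B as a sum of partial block products."""
    n = len(a)
    total = [[0.0] * size for _ in range(size)]
    for k in range(0, n, size):  # one small processor per value of k
        partial = matmul(block(a, block_row, k, size),
                         block(b, k, block_col, size))
        for i in range(size):
            for j in range(size):
                total[i][j] += partial[i][j]
    return total


A = [[6 * i + j for j in range(6)] for i in range(6)]
B = [[(i + j) % 5 for j in range(6)] for i in range(6)]

# Three 2x2 processors give the four terms c11, c12, c21, c22 of the 6x6 product.
print(partitioned_block(A, B, 0, 0, 2))
```

The final summation over the three partial results is the step that the
caption of Figure 15 notes must still be computed.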

5. SOLUTION OF SYSTEM CONTROL EQUATIONS


A wide variety of problems of great interest to NASA are readily 
formulated in terms of large linear algebra problems which need to be solved 
very rapidly with a small, low-power-consumption computer. Tracking and 
mechanical control are examples of such problems. The universal tool for all 
tracking operations and indeed for most control operations is the Kalman 
filter. The Kalman filter is essentially a means of predicting the next 
number or, as is most commonly conceived, the next vector occurrence. It 
accomplishes this by estimating the next occurrence as the previous vector cor- 
rected by a factor proportional to the difference between the presently- 
observed vector and the predicted vector. The proportionality factor is the 
Kalman factor usually symbolized by the letter K. The Kalman estimated state 
vector is preferred over any measured value for two reasons. First, we seldom 
measure the state vector directly and our measurements often do not involve all 
of the components of the state vector. Second, our process and our measurement 
are noisy and hence subject to error which can be minimized by appropriate 
statistical techniques. Thus, the Kalman estimated state vector gives the 
statistically-best estimate of the true state vector we can obtain at the 
time. 
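
The correction step described above can be sketched in a few lines of Python. This is only an illustration of the form "prediction plus gain times (observation minus predicted observation)", not the filter developed in this report. The function name `corrected_estimate`, the fixed gain matrix K, and the sample numbers are assumptions made for the example; M plays the role of the measurement matrix of Eq. (14).

```python
def corrected_estimate(x_pred, y_obs, M, K):
    """Return x_pred + K (y_obs - M x_pred) for plain Python lists.

    x_pred: predicted state (length k); y_obs: observation (length r);
    M: r x k measurement matrix; K: k x r gain matrix (assumed given).
    """
    r, k = len(M), len(x_pred)
    # Predicted observation M x_pred
    y_pred = [sum(M[i][j] * x_pred[j] for j in range(k)) for i in range(r)]
    # Difference between the observed and the predicted observation
    innov = [y_obs[i] - y_pred[i] for i in range(r)]
    # Correct the prediction by the gain-weighted difference
    return [x_pred[j] + sum(K[j][i] * innov[i] for i in range(r))
            for j in range(k)]

# Hypothetical two-component state in which only the first component is measured.
x_new = corrected_estimate([1.0, 0.5], [1.4], [[1.0, 0.0]], [[0.5], [0.25]])
```

Note that the unmeasured second component is still corrected, through the second row of K, which is the situation the text describes when measurements do not involve all components of the state vector.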

Kalman filtering is usually regarded as so complicated that it must 
be accomplished in a digital computer. Thus the event is regarded as being 
discretized in time. Of course, the time interval must be chosen to be 
commensurate with the calculational speed of the computers involved in cal- 
culating the Kalman filter. For large problems, the Kalman filter can be 
calculated only if a variety of the system parameters remain constant with 
time. Actual NASA events of interest are continuous in time, so continuous 
Kalman filtering is appropriate. We believe it will be possible to set up 
simple analog optical computers to perform continuous Kalman filtering in 
real time. 

The following brief mathematical description of Kalman filtering 
will suffice to show the computations which are involved. 

We suppose we have a k-dimensional state vector x satisfying

    \dot{x}(t) = A(t) x(t) + n(t)                                  (13)

where A(t) is a k x k matrix which may vary with time, and n(t) is a k-dimensional
noise vector which has an expected value of zero but a covariance matrix which
may be time dependent. We measure an r-dimensional vector

    y(t) = M(t) x(t)                                               (14)

where M(t) is an r x k matrix which may be time dependent. Our goal is to find
the best estimate x_e(t) of x(t) given y(t).

The Kalman filter makes use of all of the quantities just defined as 
well as of the continuously updated state vector covariance matrix P(t) and the 
noise covariance matrix 


    E[n(t_1) \cdot n(t_2)] = Q(t).                                 (15)

The Kalman gain function is

    K(t) = \{ [P A^T(t) + Q(t)] M^T(t) + P M^T(t) \} [M(t) Q(t) M^T(t)]^{-1}.    (16)

Using these, Fagin(^) showed two equivalent optimum block diagrams for 
accomplishing the desired estimation. These look complicated but they need 
not be because many of the boxes each representing a matrix multiplication 
may contain a constant matrix. It is clear, however, that optimum estimation 
requires only matrix 

• multiplication, 

• transposition, and 

• inversion 

along with a memory for temporary storage of partial results. 

Our consideration of optical architectures for the solution of
the entire Kalman filter problem is by no means complete. However, we have
begun to attack this problem in a systematic way, beginning with the realization
that the matrix for a particular problem may be too large to be handled
by a single integrated optical processor.

Our approach to solving this problem is to:

(1) design the appropriate integrated optical processors, 

(2) design suitable algorithms for those processors, and 

(3) assemble "small" processors into systems capable of 
operating on the full-sized matrix. 

Steps (1) and (2) have been discussed above. We will now concen- 
trate on ways to overcome hardware constraints on processor size. 

The Multiplication of Large Matrices 

The approaches we have suggested for constructing IOCs for carrying
out matrix multiplication use IOSLMs or herringbone electrode structures which
will probably be limited to a maximum of about 100 elements. This means that
the engagement and the fully-parallel devices will be limited to matrices of
about 100 x 100 and 10 x 10, respectively.

There are several reasons that the size of the IOCs is limited.
The most important is that, for proper operation, it is necessary to illuminate
the active region with a rather uniform, plane guided wave. It is not
feasible to reduce the width of the individual modulator units to much less
than 100 micrometers. It is also not practical to attempt to generate a
uniform guided wave with a width more than 1 cm. These two figures combine
to produce the 100-element limit (1 cm / 100 micrometers = 100).

Another limiting factor is the number of connections which can be
made to a single IOC. Although this number certainly exceeds 100, the 200
connections which are required to address the 100 x 100 engagement processor
approach the upper limit.

The approach we use to overcome the hardware constraint on the size of
the matrix multiplier is based upon the fact that any matrix can be subdivided
or partitioned into a number of smaller submatrices.(13) When multiplying two
conformable matrices which have been partitioned in a compatible manner, the
submatrices can be treated just as if they were scalar elements. As a simple
example we consider the product A·B = C where A, B, and C are 6 x 6 matrices
and the submatrices are 2 x 2:

    \begin{bmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{bmatrix}
    \begin{bmatrix} B_{11} & B_{12} & B_{13} \\ B_{21} & B_{22} & B_{23} \\ B_{31} & B_{32} & B_{33} \end{bmatrix}
    =
    \begin{bmatrix} C_{11} & C_{12} & C_{13} \\ C_{21} & C_{22} & C_{23} \\ C_{31} & C_{32} & C_{33} \end{bmatrix}    (17)

where, for example,

    C_{11} = A_{11} B_{11} + A_{12} B_{21} + A_{13} B_{31}
           = \begin{bmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{bmatrix}.    (18)


Note that each term in Eq. 18 is a 2 x 2 matrix product and each contains a
contribution to c_11, c_12, c_21, and c_22, which are four of the desired 36
matrix components. The algorithm which allows us to avoid performing a large
matrix multiplication therefore demands not only that we perform a
large number of smaller multiplications, but also that we devise a system for
carrying out the required additions.

In the example discussed above, we have replaced a 6 x 6 matrix
multiplication with 27 2 x 2 multiplications and 36 three-number sums. In a
more realistic example we might have chosen to carry out a 128 x 128
multiplication with 512 16 x 16 processors. The final output would then be
128 x 128 = 16,384 eight-number sums. In general we can perform an NM x NM
matrix-matrix multiplication by N^3 M x M multiplications. The memory required
is no more than that needed to perform any NM x NM matrix multiplication
because the submatrix products can be accumulated.
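
The partitioning scheme can be checked numerically. The following sketch, in plain Python, multiplies two 6 x 6 integer matrices using only 2 x 2 submatrix products and accumulates the partial results, as described above. The function names and the test matrices are assumptions made for the illustration; each call to `matmul` on a pair of blocks stands in for one pass through a small optical processor.

```python
def matmul(A, B):
    """Direct product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def block(A, bi, bj, m):
    """Extract the m x m submatrix in block-row bi, block-column bj."""
    return [row[bj * m:(bj + 1) * m] for row in A[bi * m:(bi + 1) * m]]

def partitioned_matmul(A, B, m):
    """Multiply via m x m submatrix products, accumulating partial sums.

    Returns (C, count), where count is the number of m x m products used.
    """
    size = len(A)
    n_blocks = size // m            # N in the text; size = N * m
    C = [[0] * size for _ in range(size)]
    count = 0
    for bi in range(n_blocks):
        for bj in range(n_blocks):
            for bk in range(n_blocks):
                # One "small processor" job: A_{bi,bk} times B_{bk,bj}
                P = matmul(block(A, bi, bk, m), block(B, bk, bj, m))
                count += 1
                # Accumulate the partial product into the output block
                for i in range(m):
                    for j in range(m):
                        C[bi * m + i][bj * m + j] += P[i][j]
    return C, count

# Arbitrary 6 x 6 integer test matrices (hypothetical data).
A = [[(i + 2 * j) % 5 for j in range(6)] for i in range(6)]
B = [[(3 * i - j) % 7 for j in range(6)] for i in range(6)]
C, count = partitioned_matmul(A, B, 2)
```

For the 6 x 6 case with 2 x 2 blocks, `count` comes out as 27 and `C` agrees with the direct product, in line with the N^3 rule for N = 3.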

The matrix-multiplication engagement processor could be used as the
basic IOC for carrying out the submatrix multiplications. It requires 2N-1
clock pulses to perform an N-dimensional matrix multiplication. The data
flow for C_11 of the 6 x 6 example is shown in Figure 15. The sums can be
carried out optically by arranging the processors so that all of the appropriate
optical outputs fall on a common detector, or electrically with individual
detectors for each processor and a series of summing circuits. Note that since
all of the submultiplications can be carried out in parallel, there is a
potentially large reduction in the processing time. Assuming that there is a
convenient way of formatting the data into the proper sub-groups, that the data
are clocked into the processor at a constant rate for all examples, and that
optical summing is used, there is a factor of 11/3 reduction in the processing
time for our 6 x 6 example (11 clock pulses for N = 6 against 3 for N = 2), and
a factor of 255/31 reduction for the 128 x 128 example (255 clock pulses for
N = 128 against 31 for N = 16). There is an obvious hardware/processing-time
tradeoff which makes this a very attractive approach to consider for handling
large problems.
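
The hardware/processing-time tradeoff can be tabulated from the 2N-1 clock-pulse rule quoted above. The sketch below is a bookkeeping aid only; the function names are assumptions, and it takes for granted the conditions stated in the text (fully parallel submultiplications, constant clock rate, optical summing).

```python
from fractions import Fraction

def clock_pulses(n):
    """Clock pulses for an n-dimensional product on the engagement processor."""
    return 2 * n - 1

def time_reduction(full_size, sub_size):
    """Pulses for one full-size processor divided by pulses for the
    parallel sub-size processors."""
    return Fraction(clock_pulses(full_size), clock_pulses(sub_size))

def processors_needed(full_size, sub_size):
    """Number of submatrix products, N**3 with N = full_size / sub_size."""
    return (full_size // sub_size) ** 3

r6 = time_reduction(6, 2)          # Fraction(11, 3)
r128 = time_reduction(128, 16)     # Fraction(255, 31)
p6 = processors_needed(6, 2)       # 27
p128 = processors_needed(128, 16)  # 512
```

With these rules the 128 x 128 example trades 512 small processors for a time reduction of 255/31, a little over a factor of 8.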

Another approach to this problem, which uses a single IOC but more
electronic hardware, is discussed in Appendix B. The question of numerical
computation is addressed in Appendix C.

Matrix Inversion 

Because matrix-matrix multiplication requires the same order of
magnitude of calculations as matrix inversion, the concept of using an
iterative matrix-matrix inversion scheme is of no interest for electronic digital
computation. Recently, however, several schemes for optical matrix-matrix
multiplication have been devised. Because these allow very fast computation, it
is of interest to apply this technique to various problems in linear algebra.
One such problem, eigenvector/eigenvalue solution, is easily attacked by a
matrix power method described elsewhere. Here we describe the inversion
of the matrix A by an iterative matrix-matrix method.

The element in the jth row and kth column of the matrix product C = AB is

    c_{jk} = \sum_{\ell} a_{j\ell} b_{\ell k},                     (19)

where the a's and b's are the components of A and B. If now we fix k we
find that we do not need all of the elements of B to calculate c_{jk}. Rather
we need only b_{1k}, b_{2k}, \ldots, b_{Nk}, the kth column of B. We want the
particular case

    c_{jk} = \delta_{jk} = \begin{cases} 1, & j = k \\ 0, & j \neq k \end{cases}    (20)

Thus we can write

    C = I = [\delta_{j1} | \delta_{j2} | \ldots | \delta_{jN}]     (21)

and

    B = A^{-1} = [b_{j1} | b_{j2} | \ldots | b_{jN}].              (22)

We compute A^{-1} by solving N equations of the form

    A b_k = c_k                                                    (23)

where

    b_k = (b_{1k}, b_{2k}, \ldots, b_{Nk})^T                       (24)

and

    c_k = (\delta_{1k}, \delta_{2k}, \ldots, \delta_{Nk})^T.       (25)


Fortunately the literature is replete with iterative solutions to
Equation 23. For example, Ralston(17) gives three methods. Of these, one
(the Gauss-Seidel method) converges for any positive definite A and converges
faster than the other two methods discussed.

We now show how the Gauss-Seidel method can be extended to matrix
inversion. Write

    A = L + U + D,                                                 (26)

where D is a diagonal matrix and L and U are, respectively, strictly lower and
strictly upper triangular matrices. We have

    A A^{-1} = I

or

    (L + U + D) A^{-1} = I.

Therefore

    (L + D) A^{-1} = -U A^{-1} + I

and

    A^{-1} = -(L + D)^{-1} U A^{-1} + (L + D)^{-1} I.              (27)

We start with some "approximate" inverse matrix (A^{-1})_0 and calculate an
improved solution

    (A^{-1})_1 = -(L + D)^{-1} U (A^{-1})_0 + (L + D)^{-1}, etc.   (28)

Ralston shows that if A is positive definite, this iteration converges
independently of our choice of (A^{-1})_0. Writing

    B = -(L + D)^{-1} U   and
    C = (L + D)^{-1} I = (L + D)^{-1},

the iteration is

    (A^{-1})_{n+1} = B (A^{-1})_n + C.                             (29)

The multiplication by B is easily carried out by optics. The addition of 
C can be carried out electronically. 

Finding B is a far easier task than inverting A, because L+D is 
essentially lower triangular. 
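
A digital sketch of the Gauss-Seidel inversion may help fix ideas. It is written in plain Python and, as an assumption of this example, never forms (L + D)^{-1} explicitly: each sweep solves (L + D) X_new = I - U X_old by forward substitution, which is algebraically the same update as Eq. (29). The function name, the starting guess of zero, the sweep count, and the test matrix are all choices made for the illustration.

```python
def gauss_seidel_inverse(A, sweeps=50):
    """Iterate (L + D) X_new = I - U X_old, starting from X = 0."""
    n = len(A)
    X = [[0.0] * n for _ in range(n)]
    for _ in range(sweeps):
        X_new = [[0.0] * n for _ in range(n)]
        for k in range(n):                  # k-th column of the inverse
            for i in range(n):              # forward substitution, row by row
                rhs = 1.0 if i == k else 0.0
                # Subtract the U X_old contribution (entries above the diagonal)
                rhs -= sum(A[i][j] * X[j][k] for j in range(i + 1, n))
                # Subtract the L X_new contribution (entries below the diagonal)
                rhs -= sum(A[i][j] * X_new[j][k] for j in range(i))
                X_new[i][k] = rhs / A[i][i]
        X = X_new
    return X

# Small symmetric positive definite test matrix (hypothetical data).
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
A_inv = gauss_seidel_inverse(A)
```

Because L + D is lower triangular, the forward substitution costs no more than a matrix-vector product per column, which is why finding B is far easier than inverting A directly.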

With no precalculation at all we can do Jacobi iteration:

    (A^{-1})_n = -D^{-1} (L + U) (A^{-1})_{n-1} + D^{-1}.          (30)

This is because inverting D is totally trivial:

    (D^{-1})_{ii} = 1 / d_{ii}.

This is not guaranteed to converge unless the Euclidean norm of D^{-1}(L + U) is
less than one. This is often the case, so the convenience of not having to
invert (L + D) may lead to a preference for this method.
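
The Jacobi variant is even shorter to sketch. The code below is an illustration in plain Python; the function names and the test matrix are assumptions. Note one substitution: instead of the Euclidean norm mentioned in the text, the helper checks the maximum absolute row sum of D^{-1}(L + U), which is simpler to compute by hand and is also a sufficient (not necessary) condition for convergence.

```python
def jacobi_inverse(A, sweeps=100):
    """Iterate X_new = D^-1 (I - (L + U) X_old), starting from X = 0."""
    n = len(A)
    X = [[0.0] * n for _ in range(n)]
    for _ in range(sweeps):
        # Every entry of the new X is computed from the old X only.
        X = [[((1.0 if i == k else 0.0)
               - sum(A[i][j] * X[j][k] for j in range(n) if j != i)) / A[i][i]
              for k in range(n)]
             for i in range(n)]
    return X

def row_sum_norm_of_jacobi_matrix(A):
    """Maximum absolute row sum of D^-1 (L + U); below 1 implies convergence."""
    n = len(A)
    return max(sum(abs(A[i][j]) for j in range(n) if j != i) / abs(A[i][i])
               for i in range(n))

# Diagonally dominant test matrix (hypothetical data).
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
norm = row_sum_norm_of_jacobi_matrix(A)
A_inv = jacobi_inverse(A)
```

The only "inversion" in the loop is the division by the diagonal element, which is the sense in which inverting D is trivial.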

An iterative linear equation solver proposed for optical solution
of

    A x = t                                                        (31)

is

    x_n = (I + A) x_{n-1} - t.                                     (32)

This carries the same sort of convergence burden of proof (||I + A|| < 1) as
the Jacobi method. Reworked for matrix inversion, it reads

    (A^{-1})_n = (I + A)(A^{-1})_{n-1} - I.                        (33)
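
Equations (29), (30), and (33) all have the form "new matrix = B times old matrix, plus C", so a single routine covers them. The sketch below, in plain Python, applies that update a fixed number of times and uses Eq. (33) as the special case B = I + A, C = -I. The function name, the step count, and the 2 x 2 test matrix (chosen so that I + A is small) are assumptions of the example.

```python
def iterate(B, C, M0, steps):
    """Apply M <- B M + C a fixed number of times and return the result."""
    n = len(B)
    M = [row[:] for row in M0]
    for _ in range(steps):
        M = [[sum(B[i][k] * M[k][j] for k in range(n)) + C[i][j]
              for j in range(n)]
             for i in range(n)]
    return M

# Eq. (33) as a special case: B = I + A, C = -I (hypothetical test matrix A).
A = [[-1.0, 0.1],
     [0.2, -0.9]]
I2 = [[1.0, 0.0], [0.0, 1.0]]
B = [[I2[i][j] + A[i][j] for j in range(2)] for i in range(2)]
C = [[-I2[i][j] for j in range(2)] for i in range(2)]
A_inv = iterate(B, C, [[0.0, 0.0], [0.0, 0.0]], 60)
```

In the optical system the product B M would be formed by the IOC and the addition of C done electronically; here both are ordinary arithmetic. The test matrix was picked so that the iteration converges; for a general A the condition ||I + A|| < 1 must be checked first.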

A general block diagram for the iterative solution of the equation

    B M_i + C = M_{i+1}                                            (34)

is shown in Figure 16.

Figure 16. A flowchart for iterative matrix inversion.








III. LUNEBURG LENSES 


6. INTRODUCTION 


The objective of this portion of the program is the development 
of procedures for the design, fabrication, and testing of Luneburg 
lenses for integrated-optical devices. The lenses are produced by 
deposition of arsenic trisulfide glass layers of prescribed profile
onto the surface of optical waveguides made by the diffusion of titanium 
into Y-cut lithium niobate crystals. 

A conventional Luneburg lens for use in planar integrated 
optics consists of a layer of high-refractive-index material deposited 
on a portion of an optical waveguide and radially symmetric about an 
axis through the center of the lens and perpendicular to the plane 
of the guide. The effective refractive index of the guided mode is 
changed locally by an amount dependent on the thickness of the overlay 
at the point in question. The Luneburg lens is an example of a gradient-index
lens, in which focusing occurs because adjacent rays encounter different
refractive indices. If the lens profile (thickness as a function
of radius) is properly chosen, a perfect geometric focus can be obtained;
that is, the lens will be diffraction-limited. The geodesic lens
and the diffraction, or grating, lens are other types of waveguide
lenses that may be produced with short focal lengths. The relative
merits of these types of lenses are discussed in our paper "Evaporated
As2S3 Luneburg Lenses for LiNbO3:Ti Optical Waveguides," for which a full
bibliographic reference may be found in Appendix D to this report.

Just as there are several types of waveguide lenses which
should be considered for a given application, there are a number of
materials which need to be evaluated if a Luneburg lens is to be used.
The lens material should have a refractive index higher than the waveguide
surface index, should not cause excessive absorption or scattering
of the guided light, and should be easy to deposit on the waveguide
surface. For lenses on LiNbO3 waveguides, arsenic trisulfide glass
is one of the few known materials to meet these criteria. Arsenic
triselenide and more complex chalcogenide glasses also have high refractive
indices, but they have fundamental absorption edges more toward the
infrared and cannot be used in the visible. ZnS and CdS have large
energy gaps and high indices of refraction, and they can be deposited
on LiNbO3. These materials form polycrystalline films; obtaining films
of good optical quality requires some care. Certain oxides, such as
TiO2, might also be used with LiNbO3 waveguides. It is difficult to
keep the oxygen content of deposited oxide films high enough to obtain
a refractive index comparable to that of the crystal, although some
sputtered TiO2 films on LiNbO3 with refractive index as high as 2.6
have been prepared (R. Holman, personal communication); the optical
quality of these films was not further assessed.

In the remainder of this portion of the report, we will first 
describe our procedures for fabricating and testing the optical waveguides 
and the As2S3 layers. Then we will describe the procedures for designing
Luneburg lenses of prescribed characteristics and for designing masks
suitable for making, by evaporation or sputtering of the glass, lenses
of the desired profile. Ray tracing will be seen to play a significant
role in assessing the adequacy of these designs. Finally, we describe
the fabrication and testing of some selected lenses, and we conclude
by summarizing the progress made, the problems encountered, and the 
questions remaining. 


7. WAVEGUIDE AND SUBSTRATE 


For all but the most preliminary work, accurate lens design
requires that the refractive index of the guided mode and the index
at the guide surface be known to the third decimal place. This
means, in turn, that the waveguides have to be fabricated by a reproducible
process on well-characterized substrates. Commercial lithium niobate
crystal plates are somewhat variable in optical properties and diffusion
coefficients, complicating the characterization process. In parallel
work, we are attempting to correlate such variable properties with
one another and with factors such as sample stoichiometry in order
to allow production of waveguides of fully predictable behavior.

In the process most frequently used to make the waveguides
employed in the present experiments, a film 17.5 nm thick of high-purity
titanium is deposited on the LiNbO3 crystal surface. The titanium
is then indiffused at 1000 C for 2.5 h in an atmosphere of slowly flowing
oxygen which has been bubbled through water at 90 C. The water vapor
compensates for out-diffusion of lithium which occurs during the diffusion
anneal. This procedure produces waveguides supporting a single mode
of each polarization at wavelength 633 nm. As no residual diffusant
remains on the surface, the waveguide is presumed to have a depth profile
of approximately Gaussian shape. To describe such a profile, two parameters
are required: the surface index and the diffusion depth. Both cannot
be obtained from the propagation constant of a single guided mode, but
reasonable extrapolations can be made from data on two- and three-mode guides.
In some cases, guides supporting more than one mode of a given polarization
have been used. Of course, only one mode can be expected to be sharply
focused by the lens.

To determine the waveguide surface index to sufficient
accuracy, we need to know, in addition to the mode indices and something
about the refractive index profile in the guiding layer, the substrate
refractive index. We have found that this quantity can be measured
quite accurately by a recently described prism coupling method.
In our implementation of this method, a symmetrical SrTiO3 prism is
clamped to the sample, as shown in Figure 17, and light is brought to
the prism-sample interface through one remaining prism face much as
one excites a propagating mode in a waveguide with a prism coupler.
The amount of light reflected from the prism-sample interface is recorded
as the angle of incidence is varied. When the angle of total internal
reflection at the interface is approached, the reflected intensity
increases rapidly. In spite of effects of imperfect beam collinearity
and varying air gaps between the prism and sample, it is generally
easy to determine the angle of total internal reflection to within
1' of arc and thence to calculate the sample refractive index to ±0.0001
or better. The refractive index and angles of the prism are found
in the conventional way using a prism spectrometer.

Figure 17. Prism-coupling method for measuring refractive
indices of LiNbO3 plates.

The ordinary refractive index of lithium niobate varies little from
sample to sample, a typical value at 633 nm wavelength being 2.2865.
The extraordinary index is more variable, ranging from 2.1996 to 2.2032,
presumably as a result of changes in stoichiometry.(19) This method has
the advantage that
measurements may be carried out either before or after the waveguide 
layer is formed. When the waveguide is present, it is possible under 
favorable circumstances to determine the guided-mode indices as well 
(H. Onodera, personal communication). More often, however, these values 
are obtained using rutile prisms to couple into and out of the guided 
modes. Measurements on one waveguide supporting two modes of each
polarization yielded the following representative results for the TM
polarization:

    substrate index         2.2868
    0-order mode index      2.2892
    surface index           2.300
    diffusion depth         2.2 µm

assuming a Gaussian profile and the validity of the WKB approximation
for evaluating the surface index and the waveguide depth. This waveguide
was used in several of our experiments on determining the refractive
index of the deposited films and the data given above have been used
in some of our more recent lens designs.

8. As2S3 FILM DEPOSITION

Arsenic trisulfide film lenses have been fabricated by two 
physical-vapor-deposition processes, rf-sputtering and thermal evaporation. 
Uniform-thickness films were also prepared by these methods for measurement 
of the film optical properties. In this section we describe the film 
deposition methods and the methods for determining the principal properties 
of the films prepared. 

The experimental arrangement for making As2S3 films by thermal
evaporation is illustrated schematically in Figure 18. The process
is carried out in a conventional bell-jar high-vacuum evaporation system,
pumped down to a pressure of 1.0 x 10^-5 torr at the start of the evaporation.
The source is 99.99% pure fused glass which has been hand-ground in
a porcelain mortar and pestle to a fine powder, approximately 325 mesh.
The powder is evaporated from a quartz crucible, 18 mm in diameter at the
top, held in a tungsten basket. Best results are obtained when the
crucible is around half full. The evaporation temperature is estimated
to be in the 500 to 700 C range and the source-to-substrate distance is
typically 100 mm. In most experiments, the film deposition rate was
around 20 nm/sec. Lenses fabricated at deposition rates of 2 nm/sec had
similar properties to those deposited at the higher rate. The masks,
which are used to shape the deposit to the desired thickness profile,
are made of thin sheets of aluminum. The crystal substrate was not heated
or cooled during the evaporation.

Figure 18. Experimental arrangement for making arsenic trisulfide Luneburg
lenses by evaporation. Not shown is a piezoelectric thickness
gauge mounted close to the crystal.

Sputtering of As2S3 thin films was first described by Watts
and co-workers. Our experiments were carried out using the same
vacuum system used for the evaporation work. A special baseplate was
constructed allowing easy conversion of the system between the two
deposition methods. For deposition of chalcogenide glasses a dedicated
system is needed because of the relatively high vapor pressure of the
materials.

The experimental arrangement for rf sputtering is indicated 
in Figure 19. Not shown in the drawing is a movable aluminum mask which 
may be inserted just above the thick profiling mask and the film thickness 
gauge in order to permit presputtering of the target. The target is 
a polished disk of 99.99% pure As2S3 glass, 102 mm in diameter and
6 mm thick, obtained from Unique Optical Company, Farmingdale, NY.
It was fastened to the target electrode with Epon epoxy resin. Before
the target was used, it was presputtered for 12 h to remove contaminants.
It is kept under vacuum when not in use. Most of the substrates used
are 3 mm thick; so the target-to-substrate distance is 23 mm. The
sputtering gas is argon, either standard laboratory grade or high-purity. 
One attempt at sputtering in high-purity nitrogen yielded a whitish 
powdery deposit, which was not analyzed. To produce films, the system 
is pumped down to about 8 x 10 "® torr; then gas is admitted and the 
films are sputtered at about 35 µm argon pressure. The operating frequency
is 13.56 MHz and the sputtering is typically done at 20 W forward power.
The film deposition rate is typically 12 nm/min, but can be varied
from 10 to 18 nm/min at this power, depending on the standing-wave
ratio and the other conditions.

Figure 19. Experimental arrangement for making arsenic trisulfide Luneburg
lenses by rf sputtering. Not shown is a movable shutter to
allow pre-sputtering of the target.

Both methods yield films which adhere well both to glass 
and to lithium niobate substrates. Under magnification, the initial 
sputtered films appeared smoother. Evaporated films often show small 
pockmarks and granules adhering to the surface, which seldom, however, 
affect the optical properties of lenses or other films in any obvious 
way. Problems were encountered, on the other hand, with some of the 
sputtered films after the system had operated a while, as discussed 
below. 

Once films have been prepared under specified conditions, 
their properties are measured. The minimal set of parameters that 
must be determined for design and fabrication of lenses is 

(1) thickness, and thickness profile in the case of lenses; 

(2) film refractive index at the design wavelength (the 
633 nm line of a He-Ne laser in all the work reported 
here); and 

(3) change in refractive index upon illumination with short-wavelength
    light or upon annealing just below the glass
    transition temperature.(22-24) This phenomenon may be
    used to adjust lens properties after fabrication, or
    it may lead to gradual change in lens properties over
    time if the lenses are not either protected or fully
    desensitized by intentional annealing or illumination.

We will describe measurements of each of these parameters in turn.

1. Lens shape and film thickness. The overall
thickness of the lens is fixed by measuring the mass
of material deposited on a piezoelectric thickness monitor
mounted near the crystal, but away from the masks. The
mass/thickness ratio is determined by profilometer thickness
measurements on test specimens. The profilometer, a
Taylor-Hobson Talysurf 4, or more recently a Talysurf
6, is also the principal instrument used to determine
the lens profile. The stylus does not damage the arsenic
trisulfide glass or the waveguide layer, and it is not
difficult to make a traverse through the thickest part
of the lens. One difficulty with Talysurf measurements
is that the substrate is often found not to be flat;
so an extrapolation of substrate surface position beneath
the lens is required in order to determine the lens thickness.
Deposition of the lens may cause some warping of the
substrate. Departures from planarity are not sufficient,
though, to have any effect on the properties of the guided
waves or on the lenses.

The profilometer work is complemented by interferometry.
A Twyman-Green arrangement is used to provide interferograms
of the lens shape; an example is shown in Figure 20. Both
white-light and monochromatic illumination yield informative
fringe patterns. The interferograms are particularly good
for detecting shape distortions resulting from misalignment
of masks or from substrate imperfections.




2. As2S3 refractive index. The refractive indices at
633 nm of arsenic trisulfide films deposited according to our
procedures have been measured by a prism deflection technique.
Prisms of uniform thickness and apex angle 30° are deposited
on LiNbO3 waveguides through triangular masks held close to
the guide. The prisms are oriented symmetrically with respect
to the z-axis of the crystal and a 633 nm guided beam is
coupled into the waveguide so it propagates along the x-axis
of the y-cut crystal. After the beam is deflected by the
overlay prism, it is end-fired out of the guide through a
polished edge and its deflection measured on a screen about a
meter away. Because of the steep edges of the prism, some
light in the prism region is frequently scattered at the input
edge into higher-order modes supported in this region. The
light is subsequently scattered back out into the substrate
fundamental mode at the output edge; so there are often 2 or
3 deflected beams corresponding to different modes in the
prism region. From each observed deflection we calculate a
mode index in the prism region, and from each mode index we
calculate, using a program similar to that used to determine
the lens thickness profile, a value for the refractive index
of the overlay material.

The unweighted average of 16 such determinations,
measured on 6 evaporated prisms varying between 0.28 and 1.21
µm in thickness, was 2.446 ± 0.006. Two different substrates
were used, and the modes observed were 6 TE0, 6 TM0, 3 TE1, and
1 TM1. The range of refractive indices found was 2.38 to
2.48; while this is larger than desirable, most of the values
clustered well around the mean. The mean for the 7 TM modes
was 2.445, while for the 9 TE modes it was 2.447. The value
2.445 was adopted for design work on evaporated lenses not
to be subjected to ultraviolet illumination.


A similar experiment was carried out using a sputtered
prism 0.93 µm thick. A single TM beam was observed in the
output when TM-polarized input was used, and two TE output
beams were found with TE input. From the deflection of the
TM0 beam, a film index of 2.61 ± 0.005 was calculated. To
obtain similar values for the beams with TE polarization,
it was necessary to assume that they corresponded to excitation
of the TE2 and TE3 modes in the prism region; it was
verified that a prism of this thickness would support these
two additional modes. The film indices obtained on this
assumption were 2.63 and 2.56. A value of 2.58 was adopted
for the design calculations on sputtered lenses; this value
will also be seen to be appropriate for evaporated lenses
which are annealed or exposed to ultraviolet. The absence
of deflected beams corresponding to the TE0 and TE1 modes
in the prism may indicate that these beams were absorbed or
very effectively scattered into the higher-order modes or
out of the waveguide. The high refractive index of the
sputtered films is most easily explained by assuming that
they have undergone during their formation a maximum
photoinduced refractive-index increase as a result of the large
amount of blue and ultraviolet light present in the sputtering
glow discharge.

3. Ultraviolet-induced change in refractive index. The
dynamic photoinduced index was determined by measuring the
change in the deflection of a guided TM0 beam by an evaporated
As2S3 prism as the prism was illuminated with an ultraviolet
lamp. This prism again had apex angle 30°; it was 0.60 µm
thick. The single-mode waveguide on which it was placed had
polished end faces. The input coupling prism and the crystal
were all enclosed in a plastic box through which dry argon
flowed throughout the experiments. The ultraviolet illumination,
strongest at 400 nm, impinged on the prism through the
top of the box. The intensity of the 400 nm line at the
sample, with the box cover in place, was 1.10 mW/cm^2. There
was a small change in refraction by the prism between the time
it was first made and when it was used in these experiments.
This preliminary change in film index was equivalent to about
10 minutes extra exposure to the uv source. For exposure
times up to about 3 hours, the refractive index of the As2S3
as measured by the deflection of the TM0 mode may be expressed
as

    n = 2.5820 - 0.1295 exp(-E/5.739),                             (35)

where E is the total uv exposure (400 nm line) in J/cm^2.
This expression is consistent with a model in which the rate
of photoinduced index change is simply proportional to the
remaining amount of unaltered material. For longer exposures,
the film index increases above the saturation value indicated
by Equation (35) to 2.594. This effect is probably related to
heating of the sample by the uv lamp. The prism region also
supported a TM1 mode; the dynamics of the index change
measured using deflection of this mode were very similar to
those with the TM0 mode, but the calculated initial film index
for this mode was about 0.02 higher than for TM0. After
6 hours exposure, a TM2 mode of even higher apparent index
showed up. The observed index changes and model curves fitted
to an equation of the type of (35) are shown in Figure 21.
While heating of the sample seems to be an important influence
on the values obtained at long times, the displacement of the
curves from one another should not go unremarked. This
difference might result from some dependence of film composition,
and consequently photosensitivity(23), on evaporation time, but
there is also greater uncertainty involved in determining the
film index from the properties of the higher-order modes.
Refractive index increases similar to those produced by
exposure to uv can also be induced by annealing. We
obtained similar changes in refractive index by annealing
films at 190 C for 1 hour in slowly flowing dry argon or
nitrogen.


Figure 21. Apparent refractive index of As2S3 film prism as function of time
of exposure to 1.1 mW/cm^2 of ultraviolet light, as determined by
deflection of 3 different modes supported in prism region. Heavy
lines connect data points; light lines are fits to data of the
form n_f = a - b exp(-t/τ).



Sputtered films were exposed to ultraviolet in a similar 
way for up to 12 hours with no significant change in their 
refractive properties. Annealing as described above also had 
no effect. This is consistent with the idea that these films 
as made have about the maximum refractive index possible for 
films of their composition. 

9. LENS DESIGN 


To describe the lens design, mask design, and ray-tracing work, it 
will be helpful to follow through a single example. We will consider a lens 
of diameter 12 mm designed to focus a 10 mm wide TM0 beam at a distance of 
21 mm, measured in the waveguide, from the lens center. The guide will be 
taken to be the one described earlier, with surface ordinary index of 2.300 
and TM0 mode index of 2.2892. The lens deposit, either sputtered or 
evaporated and annealed, will be taken to have a refractive index of 2.58, 
and the design wavelength in air is 633 nm. The design speed is f/2.1 in 
terms of the design useful aperture, or f/1.75 in terms of the total aperture. 
In terms of the back focal length often used by lens designers, the corre- 
sponding values are f/1.5 and f/1.25. 
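
The four speeds quoted above follow from simple ratios, as the short check below shows. It assumes that the back focal length is measured from the edge of the lens nearest the focus, that is, the focal distance less the lens radius; that reading is ours, chosen because it reproduces the quoted values.

```python
FOCAL_LENGTH_MM = 21.0    # focal distance from lens center, in the waveguide
LENS_DIAMETER_MM = 12.0   # total aperture
BEAM_WIDTH_MM = 10.0      # design useful aperture


def f_number(distance_mm, aperture_mm):
    """Ratio of a focal distance to an aperture width."""
    return distance_mm / aperture_mm


# Assumed definition: distance from the rear edge of the lens to the focus.
back_focal_mm = FOCAL_LENGTH_MM - LENS_DIAMETER_MM / 2.0

speeds = {
    "useful aperture": f_number(FOCAL_LENGTH_MM, BEAM_WIDTH_MM),
    "total aperture": f_number(FOCAL_LENGTH_MM, LENS_DIAMETER_MM),
    "useful aperture, back focal length": f_number(back_focal_mm, BEAM_WIDTH_MM),
    "total aperture, back focal length": f_number(back_focal_mm, LENS_DIAMETER_MM),
}

if __name__ == "__main__":
    for label, value in speeds.items():
        print(f"f/{value:.2f}  ({label})")
```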

To determine the required refractive index profile, we solve the 
integral equation 

\[ \ln\frac{N(r)}{N_{ext}} = \frac{2}{\pi}\int_{0}^{(t-z)^{1/2}} \frac{\sin^{-1}(u^{2}+z)}{(u^{2}+2z)^{1/2}}\,du . \tag{36} \]



In this equation, N(r) is the mode index at radial distance r in the lens 
region, while N_ext is the mode index outside the lens, 2.2892 in our example. 
The parameter t is the reciprocal of twice the full-aperture f/number of the 
lens, or 0.2857, and z = tR, where R = 2r N(r)/(A N_ext). A is the full lens 
aperture, 12 mm. In the form presented here, the integral is easily evalu- 
ated to 5 decimal place accuracy by a single 16-point Gaussian quadrature. 
The parameter R will be seen to range between zero and unity. To determine 
the index profile, we select a suitable set of values of R and for each one 
evaluate the integral; each evaluation yields a value for N(r)/N_ext, and 
from this we finally determine from the definition of R the value of r to 
which the calculation applies. The refractive index profile is shown in 
Figure 22. 
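
The procedure just described can be sketched in a few lines. The code below assumes that Equation (36) has the form ln[N(r)/N_ext] = (2/π) ∫ sin⁻¹(u² + z) (u² + 2z)^(-1/2) du, taken from u = 0 to (t - z)^(1/2), and evaluates it with a single 16-point Gauss-Legendre rule. The quadrature routine and all function names are our own illustration, not the original design program.

```python
import math

N_EXT = 2.2892            # mode index outside the lens
APERTURE_MM = 12.0        # full lens aperture A
T = 1.0 / (2.0 * 1.75)    # reciprocal of twice the full-aperture f/number


def gauss_legendre(n):
    """Nodes and weights of the n-point Gauss-Legendre rule on [-1, 1]."""
    nodes, weights = [], []
    for i in range(1, n + 1):
        x = math.cos(math.pi * (i - 0.25) / (n + 0.5))   # starting guess for a root
        for _ in range(100):
            p0, p1 = 1.0, x                              # Legendre P_0 and P_1
            for k in range(2, n + 1):
                p0, p1 = p1, ((2 * k - 1) * x * p1 - (k - 1) * p0) / k
            dp = n * (x * p1 - p0) / (x * x - 1.0)       # derivative of P_n
            dx = p1 / dp
            x -= dx                                      # Newton step
            if abs(dx) < 1e-15:
                break
        nodes.append(x)
        weights.append(2.0 / ((1.0 - x * x) * dp * dp))
    return nodes, weights


NODES, WEIGHTS = gauss_legendre(16)


def index_ratio(R):
    """N(r)/N_ext for a given R, using the assumed form of Equation (36)."""
    z = T * R
    upper = math.sqrt(max(T - z, 0.0))
    total = 0.0
    for x, w in zip(NODES, WEIGHTS):
        u = 0.5 * upper * (x + 1.0)                      # map [-1, 1] onto [0, upper]
        denom = math.sqrt(u * u + 2.0 * z)
        if denom > 0.0:
            total += w * math.asin(u * u + z) / denom
    return math.exp((2.0 / math.pi) * 0.5 * upper * total)


def index_profile(steps=10):
    """List of (r in mm, mode index) pairs for R = 0, 1/steps, ..., 1."""
    profile = []
    for i in range(steps + 1):
        R = i / steps
        ratio = index_ratio(R)
        r_mm = R * APERTURE_MM / (2.0 * ratio)           # invert R = 2 r N(r) / (A N_ext)
        profile.append((r_mm, N_EXT * ratio))
    return profile


if __name__ == "__main__":
    for r_mm, n_mode in index_profile():
        print(f"r = {r_mm:5.3f} mm   N = {n_mode:.4f}")
```

With these inputs the mode index at the lens center comes out near 2.508 and falls to N_ext at r = 6 mm, which stays below the assumed film index of 2.58 as it must.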

To find the lens thickness profile corresponding to this 
refractive-index profile, we take the waveguide to have a Gaussian profile 
with a diffusion length of 2.2 µm, a surface index of 2.300, and a bulk 
index of 2.2868. The values other than the surface index are not highly 
critical to the accuracy of the calculations. Since the lens thickness 
varies slowly with radius, we model the situation at a given radius as a 
uniform layer of lens material covering the inhomogeneous waveguide. We 
made a straightforward extension of the calculation method devised by 
Southwell (28) to this case, assuming that the waveguide layer in the LiNbO3 
could be described within the WKB approximation. In most of the lens 
region, the major portion of the optical energy is drawn into the lens 
layer, and the electric field of the optical wave decays throughout the 
substrate. Under such circumstances, a simplified three-layer waveguide 
approximation may be used. This analysis applies, strictly speaking, only 
to TE modes, except of course that index values appropriate to the TM0 mode 
in the LiNbO3 are inserted into the calculation. While this is in most 
circumstances a fairly good approximation, it should be corrected for more 
accurate lens design (29); as we shall see, the present design happens to 
be one in which this part of the analysis is inadequate for peripheral rays. 
The calculated thickness profile, normalized to unity at the lens center, 
is shown in Figure 23. The design central thickness is 0.39 µm. 
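
The simplified three-layer approximation mentioned above can be illustrated with the standard TE0 dispersion relation for a uniform film between a uniform substrate and air. This is only a sketch: it replaces the graded waveguide by a uniform substrate at the surface index of 2.300, which is our simplifying assumption, and it takes the central mode index of about 2.508 from a numerical evaluation of Equation (36) rather than from the original design program.

```python
import math

WAVELENGTH_UM = 0.633   # design wavelength in air
N_FILM = 2.58           # lens deposit index
N_SUBSTRATE = 2.300     # assumption: uniform substrate at the guide surface index
N_COVER = 1.0           # air above the lens


def te0_thickness(mode_index):
    """Film thickness (µm) that supports a TE0 mode with the given mode index.

    Uses the three-layer slab condition
        k0 * kappa * h = atan(gamma_s / kappa) + atan(gamma_c / kappa),
    valid for N_SUBSTRATE < mode_index < N_FILM.
    """
    k0 = 2.0 * math.pi / WAVELENGTH_UM
    kappa = math.sqrt(N_FILM**2 - mode_index**2)          # transverse constant in film
    gamma_s = math.sqrt(mode_index**2 - N_SUBSTRATE**2)   # decay constant in substrate
    gamma_c = math.sqrt(mode_index**2 - N_COVER**2)       # decay constant in cover
    phase = math.atan(gamma_s / kappa) + math.atan(gamma_c / kappa)
    return phase / (k0 * kappa)


if __name__ == "__main__":
    central_index = 2.5082   # assumed central mode index (see lead-in)
    print(f"central thickness ~ {te0_thickness(central_index):.3f} µm")
```

Under these assumptions the central thickness comes out close to 0.39 µm, in line with the design value quoted above; the agreement should not be read as confirmation that the original calculation used exactly this form.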

10. MASK DESIGN 


To design a set of shadowing masks which will yield a lens 
deposit of the desired thickness-vs-radius profile, we need suitable models 
of i) the molecular detachment process at the evaporation or sputtering 
source, ii) the transport through the masks to the substrate, and iii) the 
deposition process at the substrate. Given these, we then require a 
straightforward but tedious search procedure to determine the positions 
and apertures of the necessary masks. In devising the models and computa- 
tional procedures, we have drawn on previous work on mask design for 
sputtered (30-32) and evaporated (33) lenses for glass substrates, although 
with a number of modifications. Since we are dealing with a smaller index 
difference between the deposited material and the substrate than are those 
using glass waveguides, it seems we should not need quite as complicated 
mask arrangements in order to obtain good quality lenses. 

Figure 22. Design mode-index profile for f/2.1 Luneburg lens of 10 mm input 
aperture, designed to focus TM0 mode of LiNbO3:Ti waveguide. 

Figure 23. Design lens profile (dashed line) and approximation (solid line) 
with mask arrangement described in text for example f/2.1 Luneburg 
lens. 

The evaporation source is modeled as a uniform distribution, 
over a circle whose diameter is that of the top of the crucible, of point 
sources, each of which emits As2S3 molecules (or equivalently As4S6 
molecules (34)) uniformly into the hemisphere above the source. A Lambertian 
distribution (weighted by the cosine of the angle from the normal to the 
source surface) is often (33) assumed instead of a uniform distribution; 
but since in our system the actual melt surface is well below the assumed 
source plane, a uniform distribution seems an equally valid assumption. In 
any event, since the evaporation source is relatively small, only small 
angles are involved, and the two models are difficult to distinguish. 

The sputtering source is similarly modeled as a uniform circular 
distribution of point sources, now assumed, however, to emit with a 
Lambertian distribution, as often observed experimentally ^ 3 ^, and as 
usually assumed^) in the absence of better information. Yao assumed 
a source of infinite extent, but since our sputtering target is relatively 
small and close to the substrate, we have found it necessary to take its 
finite size into account. 

The particles emitted from the source are assumed to travel in 
straight lines to the substrate. If a particle hits a mask, it is assumed 
to be deposited permanently there. The sticking probability at the sub- 
strate is taken to be independent of the film thickness and independent 
of whether the particle condenses on the substrate material or on the film. 

We design the mask arrangement to obtain as good as possible an 
approximation to the relative shape of the deposit, normalized to unity at 
the center of the lens, and rely on a separate measurement of deposition 
rate to get the central thickness correct. The masks for the evaporation 
work are made of thin sheets of aluminum with holes punched in them. For 
the sputtering experiments, where space is more limited, we adopted the 
idea of Zernike (36) and Yao and Anderson (32) of making the masks by milling 
conically tapered segments in aluminum plates. 

The relative lens thickness at a point a given radius r from the 
center may be calculated by integrating the flux arriving at this point 
from the observable area of the source and dividing by the integrated flux, 
similarly calculated, at the lens center. The area of integration in the 
plane of the source is bounded by arcs of circles which are the projections 
of the mask segments on this plane and possibly also by the edge of the 
source. The complexity of determining the boundaries of the integration 
area for each mask arrangement and each value of r makes evaluating the 
integrals analytically quite difficult. Consequently we have adopted a 
numerical procedure, specifically a statistical procedure, the method of 
equidistributed sequences. The area of integration is conveniently 
taken to be a rectangle closely bounding the observable region; unit functions 
in the integrand are used to reject points within the rectangle but not in 
the observable region. 

An equidistributed sequence of points in the interval [0,1] is 
defined as an ordered set {x_i, i = 1,N} which is determined such that 

\[ \lim_{N\to\infty}\frac{1}{N}\sum_{i=1}^{N} f(x_i) = \int_{0}^{1} f(x)\,dx \tag{37} \]

for all reasonably well-behaved functions f(x). It can be shown that 
such a sequence may be generated by taking the decimal parts of successive 
integral multiples of any irrational number, such as √2. Integrals over 
any finite limits may be evaluated using (37) by appropriate scaling, and 
multidimensional integrals are easily handled by using an independent 
sequence for each dimension. Simple numerical tests we have carried out 
indicate the method of equidistributed sequences is at least five times as 
fast as a Monte Carlo integration of comparable accuracy. For the present 
problem, evaluating the integrand at 2000 points proved sufficient to give 
the integrals to 3 decimal place accuracy with high probability. 
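
A minimal sketch of the method is given below. It generates the sequence from multiples of √2 and, for a second dimension, √3; the choice of √3 for the second sequence, the test integrands, and the function names are our own assumptions for illustration. The two-dimensional example uses a unit function to reject points outside the region of interest, in the manner described above.

```python
import math


def equidistributed(n_points, irrational):
    """Fractional parts of k * irrational for k = 1 .. n_points."""
    return [(k * irrational) % 1.0 for k in range(1, n_points + 1)]


def integrate_1d(f, a, b, n_points=2000):
    """Estimate the integral of f over [a, b] by scaling Equation (37)."""
    xs = equidistributed(n_points, math.sqrt(2.0))
    return (b - a) * sum(f(a + (b - a) * x) for x in xs) / n_points


def integrate_2d(f, x_range, y_range, n_points=2000):
    """Estimate a double integral over a rectangle, one sequence per dimension."""
    (x0, x1), (y0, y1) = x_range, y_range
    us = equidistributed(n_points, math.sqrt(2.0))
    vs = equidistributed(n_points, math.sqrt(3.0))
    total = sum(f(x0 + (x1 - x0) * u, y0 + (y1 - y0) * v) for u, v in zip(us, vs))
    return (x1 - x0) * (y1 - y0) * total / n_points


def inside_unit_disk(x, y):
    """Unit function: 1 inside the region of interest, 0 elsewhere in the rectangle."""
    return 1.0 if x * x + y * y <= 1.0 else 0.0


if __name__ == "__main__":
    print("integral of x^2 on [0, 1]:", integrate_1d(lambda x: x * x, 0.0, 1.0))
    print("area of unit disk:", integrate_2d(inside_unit_disk, (-1.0, 1.0), (-1.0, 1.0)))
```

Because the points are deterministic, repeated runs give identical results, which is convenient when comparing mask arrangements that differ only slightly.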

To automate the design process somewhat, one mask aperture was 
allowed to take on a range of values; for each value the calculated lens 
profile was compared to the design profile at 5 or 10 interior points. A 
least-squares comparison was used, although a minimax criterion might have 
been somewhat more useful. The output was examined in detail in all pro- 
mising cases and further adjustments were made by hand until what appeared 
to be a satisfactory approximation to the design profile was reached. 
Experiment and ray-tracing calculations have generally shown that while our 
designs so far are satisfactory near the center of the lens, they sometimes 
fail near the periphery. For the example design we have been discussing, 
we found after extensive calculation with up to four masks that a simple 
two-mask arrangement fit the design as well as any. In this arrangement 
the lower mask has an aperture 12 mm in diameter and is positioned 0.04 mm 
above the substrate, while the upper mask, with an 8.3 mm aperture, is 
positioned 4.8 mm above the substrate. We found experimentally that 
separating the lower mask from the substrate with aluminum foil improved 
the coupling into the lens. This mask arrangement is easily created by 
milling an aluminum plate. The calculated profile for this mask arrangement 
is compared with the design profile in Figure 23. For some other sputtered- 
lens designs, mask arrangements that taper primarily the other way — that is, 
opening toward the source — are more suitable, while for evaporated lenses, 
small apertures 25 to 30 mm below the substrate are required with our 
experimental arrangement. 
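
The flux calculation for the two-mask arrangement can be sketched as follows. The mask heights and apertures are those given above; the source diameter and source-to-substrate distance are hypothetical values chosen only to make the example run, and the integration rectangle is simply the square bounding the source rather than one fitted closely to the observable region. A Lambertian source and straight-line transport are assumed, so each visible source point contributes in proportion to H²/d⁴, where H is the source height and d the distance to the substrate point.

```python
import math

SQRT2, SQRT3 = math.sqrt(2.0), math.sqrt(3.0)

# (height above substrate in mm, aperture diameter in mm), from the text
MASKS = [(0.04, 12.0), (4.8, 8.3)]

# Hypothetical source geometry (not given in the text)
SOURCE_HEIGHT_MM = 40.0
SOURCE_DIAMETER_MM = 50.0


def arriving_flux(r_mm, n_points=2000):
    """Relative flux reaching the substrate point at radius r_mm (arbitrary units)."""
    rs = SOURCE_DIAMETER_MM / 2.0
    total = 0.0
    for k in range(1, n_points + 1):
        # Equidistributed point in the square bounding the source disk
        x = rs * (2.0 * ((k * SQRT2) % 1.0) - 1.0)
        y = rs * (2.0 * ((k * SQRT3) % 1.0) - 1.0)
        if x * x + y * y > rs * rs:
            continue                        # outside the source
        visible = True
        for height, diameter in MASKS:
            f = height / SOURCE_HEIGHT_MM   # fraction of the way up the ray
            px = r_mm + f * (x - r_mm)      # where the ray crosses the mask plane
            py = f * y
            if px * px + py * py > (diameter / 2.0) ** 2:
                visible = False             # ray is intercepted by this mask
                break
        if not visible:
            continue
        d2 = (x - r_mm) ** 2 + y * y + SOURCE_HEIGHT_MM ** 2
        total += SOURCE_HEIGHT_MM ** 2 / (d2 * d2)   # cos(source) * cos(substrate) / d^2
    return total / n_points


def relative_thickness(r_mm):
    """Deposit thickness at radius r_mm, normalized to unity at the lens center."""
    return arriving_flux(r_mm) / arriving_flux(0.0)


if __name__ == "__main__":
    for r in (0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0):
        print(f"r = {r:.1f} mm   relative thickness = {relative_thickness(r):.3f}")
```

A design search of the kind described would wrap this calculation in a loop over one mask aperture and compare the result with the design profile at a handful of interior radii.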


11. RAY TRACING 


The only way to determine what happens when a Luneburg lens 
departs from the ideal shape, or varies from the design refractive index, 
is to trace a sufficient number of rays through the lens to see what happens 
to the focal spot. We expect, of course, a change in focal length and a 
decrease in focal spot quality, but these changes have to be evaluated 
quantitatively in order to determine the tolerances required to meet the 
specifications of a particular application. Specifically, we can hope to 
learn from ray tracing 

(1) the adequacy of our lens design procedures 

(2) the adequacy of our mask design procedures 

(3) the degree to which the physical properties (e.g., the film 
refractive index) of the lenses we have made match 
those assumed in the designs 

(4) the effect of variations in experimental parameters 
(e.g., the lens central thickness) on the focal spot quality 

(5) some aspects of the overall behavior of Luneburg lenses 
which may not be intuitively obvious. 

Ray tracing thus plays an important role in closing the loop 
between design and fabrication. 
Before discussing some ray-trace work applied to our representative 
design, we will describe one way in which ray tracing can help improve our 
understanding of Luneburg lens behavior. We made a series of evaporated 
lenses with the same mask arrangement, but with different evaporation times, 
so some lenses had close to the design central thickness while others were 
considerably thinner. One might naively expect that lenses thinner at the 
center, with lower mode indices and less steep gradients, would have longer 
focal lengths than thicker ones, but we found a considerable range over which 
the thinner lenses had reduced focal lengths. This is illustrated in 
Figure 24, where experimental points are indicated by circles and the design 
focal length and thickness are shown at the rightmost "X". The other two 
X's indicate the results of ray-trace analyses through lenses with the 
design index profile but reduced thickness. Clearly, the reduced focal 
lengths are what one should expect under these conditions. Examination of 
the ray diagrams shows that rays passing through different parts of the lens 
cross the axis at points which, depending on the lens thickness and the 
entrance coordinate, may be either in front of or behind the design focal 
position. Thus, the apparent sharpest focal point depends on the aperture 
used in a complicated way. The ray-trace also indicates that the lens 
indicated by the leftmost X in Figure 24 should have a focal spot smaller 
than the diffraction limit at apertures corresponding to f/4.8 or smaller. 
The sidelobes might not be low (the diffraction pattern was not calculated), 
but these simple geometrical optics calculations indicate the possibility 
of obtaining good quality lenses for some purposes with designs that vary 
markedly from the conventional Luneburg contour. 

Now we turn to our design example, which we recall is for a lens 
to focus a 10 mm wide TM0 input beam 21 mm from the lens center. First we 
consider a ray trace through a lens with the design thickness profile shown 
in Figure 23. Some representative rays are shown in Figure 25. It is 
immediately apparent that a good focus, though slightly short of the 
design focal length, can be obtained at somewhat reduced aperture, but that 
peripheral rays are not focused well. Both the change in focal length and 
the long focal length of the peripheral rays are results of not using the 
correct boundary condition for the TM mode, as discussed earlier. This 
easily remediable deficiency in the design program has never been corrected 
since it never led to serious aberrations in earlier designs. Since there 
is little chance of getting this design to work well at full aperture, we 
will consider its properties at half aperture. The focal length for a 5 mm 
wide input beam is 19.8 mm and the focal plane diffraction pattern is shown 
in Figure 26. The graphics have not been changed from those for conventional 
two-dimensional lenses; a traverse through the center of the plot parallel to 
either axis is a good approximation to the Luneburg lens focal-plane intensity 
distribution. The vertical scale is linear. All the diffraction patterns 
presented in this report should be considered only as qualitative representa- 
tions, since not enough rays have been traced to obtain a completely accurate 
quantitative picture. Nonetheless, one can see that at the reduced aperture 
the lens should focus quite well, with a focal spot size in the range of the 
diffraction limit. 

Figure 24. Measured (circles) and calculated (x's) focal lengths for Luneburg lenses 
of different thicknesses made with the same mask arrangement. 

Figure 25. Ray trace through lens with example design thickness profile 
shown in Figure 23. 

To investigate the result of a small change in design parameters, 
let us consider the effect of a reduction in the refractive index of the 
lens material from 2.58 to 2.573. This amounts to a change of 2.5% in the 
difference between the film index and the waveguide surface index. A trace 
of representative rays for this case is shown in Figure 27. The ray plot 
is quite similar to the previous one, but there are discernible differences. 
The focal length at 5 mm aperture has increased slightly, to 20.2 mm. The 
diffraction pattern, Figure 28, at this aperture still has a sharp focal spot, 
but the sidelobes are increased somewhat. Increasing the lens central 
thickness from 0.39 to 0.40 µm produced effects of comparable magnitude. 
The calculated focal length at 5 mm aperture shortened to 19.1 mm and the 
focus became a little sharper. 

In Figure 29, we present a trace of representative rays through a 
lens with the profile attainable with the mask design described in the 
previous section. The lens shows a fairly sharp focus at apertures up to 
6 mm, but peripheral rays are again only weakly focused. The focal length 
at 5 mm aperture is reduced to 18.3 mm. The variation of focal length with 
small changes in lens shape is large enough that it appears that in applica- 
tions such as collimation where precise focal length control is important, 
it is highly desirable to have a means of adjusting the focal length after 
fabrication. The focal-plane diffraction pattern for this lens, Figure 30, 
shows that the central focal spot remains fairly sharp, but the sidelobes 
are considerably increased. 

All the ray tracing work described here was done by Professor 
Duncan T. Moore and C. Benjamin Wooley at the University of Rochester 
Institute of Optics under subcontract to Professor Moore's firm, Gradient 
Lens Corporation. 
Figure 26. Focal-plane diffraction pattern for lens ray-traced in 
Figure 25. The diffraction pattern for a Luneburg lens 
is given approximately by a section parallel to a coordinate 
axis. 

Figure 27. Ray trace through a lens similar to that in Figure 25, but 
with reduced film refractive index. 

Figure 29. Ray trace through a lens with profile, shown in Figure 23, 
attainable with simple mask arrangement. 

Figure 30. Focal-plane diffraction pattern for lens 
ray-traced in Figure 29. 


12. LENS FABRICATION AND TESTING 


Numerous lenses have been made and tested. The fabrication 
procedures have already been described. We frequently departed in one 
respect or another from the nominal design procedures in order to investi- 
gate experimentally effects of variations in material and process parameters 
such as were investigated theoretically in the previous section. 

The principal experimental data obtained on the lenses were the 
focal length and, in relatively good lenses, the focal spot quality. In 
most cases the lenses focus outside the waveguide. To determine the focal 
length, we measure the length of the optical path in each medium— waveguide, 
output coupling prism if used, and air — from the lens center to the focal 
point, and convert the total distance to an equivalent distance inside the 
waveguide. The accuracy of this procedure is not very high; our quoted 
focal lengths might easily be in error by ±10%. 

Our primary method for characterizing the optical quality of the 
Luneburg lenses is the examination of the light distribution in the focal 
plane. The focal spot is scanned by coupling the beam transmitted through 
the lens out through a rutile prism and refocusing it with an f/2 imaging 
lens onto an optical multichannel analyzer (OMA). The experimental arrange- 
ment is shown in Figure 31. The OMA has 500 channels, each 25 µm wide, on 
25 µm centers. The channels are long enough to collect substantially all the 
light diffracted or scattered in the direction perpendicular to the wave- 
guide plane. The OMA output can be displayed on an oscilloscope screen or 
recorded digitally. In some instances, a Reticon diode array has been used 
instead of the OMA. 

OMA scans of the lens focal spots have generally been made at 
reduced input aperture in order to be sure that all the light transmitted 
is captured by the relay lens and focused on the OMA detector. The 
diffraction patterns do not sharpen markedly at larger apertures, but we 
cannot presently say how much of this effect results from poorer quality 
of the lens near the periphery and how much from aperture effects in the 
light-collection system. The focal spot quality does not vary in any 
marked or predictable way with lens thickness. The spot quality data 
may be evaluated by noting that for 5 mm aperture and 21 mm focal length, 
the half-power diffraction-limited spot size for the TM0 mode at 633 nm is 
2.3 µm. For an ideal lens, the first sidelobes should be 13.3 dB down in 
intensity from the central peak. These two numbers, focal spot size and 
first-sidelobe intensity, do not fully characterize the lens quality, of 
course, even at fixed input aperture. It should also be borne in mind that 
the input beam is a truncated Gaussian rather than an ideal plane wave. 

Figure 31. Experimental arrangement for observation of Luneburg 
lens focal plane characteristics. 

From among our results on evaporated lenses, we will present 
representative data on just two, one tested as made and the other exposed 
to ultraviolet light to increase its refractive index. Additional details on 
a number of evaporated lenses are provided in our previous year's report. 
These lenses were again 12 mm in diameter; they were formed by evaporating 
arsenic trisulfide through a mask with a 5.0 mm diameter aperture held 
28 mm below the waveguide and an 11.9 mm aperture edge-defining mask placed 
0.5 mm from the guide. The design is intended to focus a 10 mm wide TM0 
beam at a distance of 30 mm from the center of the as-prepared lens. 

A Reticon diode array scan (25 µm detectors on 50 µm centers) for 
one as-prepared lens is shown in Figure 32. This lens is 0.70 µm thick, 
compared to a nominal design value of 1.69 µm. It has a measured focal 
length of 25 mm at 10 mm aperture. In the scan, which was made at 4 mm 
aperture, the central spot is 2.3 µm wide, compared with a diffraction- 
limited value of 1.5 µm at this aperture, and the first sidelobes are 
11 dB down. This represents one of our better as-prepared lenses. The 
marked difference in thickness from the design value seems to have little 
effect on the focal spot quality, although it does affect the focal length. 
The ray-tracing work, though, indicates that often even when the focal spot 
is sharp, the phase deviations in the focal plane are of far from standard 
form. This effect, a form of the phenomenon referred to (39) as "spurious 
resolution", can have serious consequences if the lens is to be used for 
optical data processing. 
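
The diffraction-limited width quoted for this lens, and the sidelobe level of an ideal lens, can be checked with a short calculation. The sketch assumes a uniformly illuminated aperture, for which the focal-plane intensity in one dimension is a sinc² pattern with a half-power width of 0.886 λF/D, and uses the wavelength inside the guide, 633 nm divided by the TM0 mode index. These are textbook relations adopted here as assumptions; the measured input beam was a truncated Gaussian.

```python
import math

MODE_INDEX = 2.2892      # TM0 mode index of the guide
WAVELENGTH_UM = 0.633    # wavelength in air


def half_power_width_um(focal_length_mm, aperture_mm):
    """Half-power width of the sinc^2 focal spot for a uniformly filled aperture."""
    guided_wavelength = WAVELENGTH_UM / MODE_INDEX
    return 0.886 * guided_wavelength * focal_length_mm / aperture_mm


def first_sidelobe_db():
    """Level of the first sidelobe of (sin x / x)^2 below the central peak, in dB."""
    # The sidelobe peak lies where x cos x - sin x = 0; bracket the first root.
    g = lambda x: x * math.cos(x) - math.sin(x)
    lo, hi = 4.0, 4.7
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    x = 0.5 * (lo + hi)
    return -10.0 * math.log10((math.sin(x) / x) ** 2)


if __name__ == "__main__":
    print(f"25 mm focal length, 4 mm aperture: {half_power_width_um(25.0, 4.0):.2f} µm")
    print(f"first sidelobe: {first_sidelobe_db():.1f} dB down")
```

For the 25 mm focal length and 4 mm aperture of the as-prepared lens this gives about 1.5 µm, and the ideal first-sidelobe level is about 13.3 dB.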

The Reticon scan shown in Figure 33 is for a lens which was 
exposed to ultraviolet light until its refractive index was increased to 
around 2.58. This lens is 1.2 µm thick and has a focal length of 18 mm. 
The focal spot is 2.9 µm wide, or 2.7 times the diffraction limit, and the 
first sidelobes are more than 13 dB down. As with altering the thickness 
of the as-prepared lenses from the design value, increasing the film 
refractive index changes the focal length without changing the focal-spot 
characteristics as markedly as one might expect. The same remarks about 
phase deviations apply. 
Figure 32. Diode-array scan of focal plane of as-prepared 
evaporated lens. 

Figure 33. Diode-array scan of focal plane of evaporated 
lens exposed to ultraviolet light. 


Luneburg lenses need not be circular in shape as viewed from 
above (40). We prepared by evaporation some As2S3 lenses of more nearly 
rectangular shape and measured their characteristics. Our work in this 
area is described in our paper "Rectangular Luneburg-type Lenses for 
Integrated Optics", item 3 in Appendix D. This paper and item 1 in that 
list also contain additional data on focal plane scans. 

Only preliminary tests of information handling capacity of the 
lenses have been made. In one experiment, the lens was deposited on a 
large substrate and broad-band transducers were arranged to generate surface 
acoustic waves in the lens input plane. The SAW frequency was swept from 
280 to 410 MHz and a digital word generator was used to sample periodic 
portions of this range. For example, "on" segments 6.2 MHz wide separated 
by 6.2 MHz "offs" were readily resolved photographically. Rays to the 
centers of adjacent "on" segments have an angular separation of about 
0.5 mrad. The spatial frequencies are separated by about 0.5 µm. The 
ultimate resolution is clearly higher, but unfortunately this lens was 
accidentally destroyed before testing could be completed. These results, 
while encouraging, should not be weighed too heavily in view of the known 
phase aberrations in many of our lenses. 

Initial experiments on sputtered lenses were also encouraging. 
These experiments were started before designs for the masks were completed, 
so a variety of masks were made on the basis of rough guesses about useful 
shapes and tried out. Masks with a single conical taper, similar to those 
used by Zernike (36), and masks with a double conical taper, like those 
described by Yao and Anderson (31), were both used, as was a mask with a 
conical plus a cylindrical section. The mask apertures were made by 
drilling through 4.8 mm (3/16") thick aluminum sheet. In all these masks, 
the aperture was 6.75 mm in diameter at its narrowest point. The lenses 
are approximately 11 mm in diameter and either 0.5 or 0.6 µm thick at the 
center. While all the lenses focused 633 nm guided light to some extent, 
those made with the single conical masks had relatively diffuse focal spots. 
One lens made with a double-cone mask and 0.6 µm thick had a focal length 
of 25 mm inside the waveguide. A 0.6 µm thick lens made with a cylinder- 
plus-cone mask also had a 25 mm focal length. The lens deposits appeared 
to be of good quality. 

After the sputtering system had been in operation for some time, 
though, the quality of the lenses deteriorated markedly. The difficulties, 
which appeared in lenses of all sizes and shapes, may be summarized by 
saying that thicker lens deposits tended to absorb or scatter the incident 
light very strongly, so no output beam could be detected, while thinner 
deposits did not refract as strongly as expected, focal lengths referred to 
the waveguide material being 60 mm or more. Varying the sputtering 
conditions, replacing masks and shutters, using high-purity argon instead 
of standard laboratory grade, and cleaning and overhauling the system had 
no effect. Presputtering for 1 to 2 hours seemed to help somewhat as did 
raising the mask slightly above the waveguide as we have previously des- 
cribed. Microscopic examination of some of these later films showed 
scattered platelets a few micrometers in diameter adhering to the surface 
but similar-looking platelets observed on some evaporated films did not 
affect their properties markedly. This problem is unresolved at this 
writing, but chemical analyses, to be described shortly, have provided 
some helpful diagnostic information. Thus while we anticipated sputtering 
would be a more controllable and reproducible process than evaporation for 
fabricating the lenses, it has not so far proven to be so. 


13. CHEMICAL ANALYSES 


In view of the differences between evaporated and sputtered films, 
and between "early" and "late" sputtered films, it seemed worthwhile to 
investigate the chemical nature of the films and of the raw materials from 
which they were made. The primary analytical technique employed was ESCA 
(electron spectroscopy for chemical analysis), since this method provides 
information on bonding, and thus enables one to determine what compounds, 
as well as what elements, are present. The film samples were uniform 
layers in the 0.4 to 1.0 µm thickness range, deposited on glass slides 
which could be cut to the approximate size for insertion into the ESCA 
apparatus. Examination of the films under magnification showed small 
platelets adhering to some areas of the sputtered films, particularly the 
more recently prepared films. The evaporated film had fewer such features. 
Robinson back-scatter electron micrographs of the films also showed 
numerous surface features, appearing at 500X like little balls, on the 
sputtered films. It could not be ascertained whether these were the same 
features observed visually, but it is suspected they are similar because 
they show similar tendencies to follow polishing marks and other such 
imperfections in the underlying glass plate. The electron micrographs did 
not reveal whether the surface features were of different composition from 
the substrate, but ESCA experiments with the angle of the incident X-ray 
beam increased from 45 to 63° showed no significant changes in composition, 
indicating that these features probably have a composition similar to the 
bulk. One drawback of ESCA is that it provides information on the composition 
only of the top 3 nm of the sample, so other techniques are necessary if there 
is any reason to believe the composition may not be uniform through the 
sample. Also ESCA looks at the whole sample surface, and is not suited for 
studying variations from point to point on the surface. 

All the samples are very pure. The only contaminants detected were 
minor amounts of surface organics. In particular, there is no evidence of 
As2O3 or As^S^ molecules. ESCA data do not seem to be available for 
the other stoichiometric compound of arsenic and sulfur, but as there are 
no unidentified peaks, the likelihood of this material being present is 
small. These other compounds could be detected, if present, at a level 
of a few percent. Thus all the materials appear to have exclusively 
As2S3-type bonding at the local level. Integration of the areas under the 
peaks shows, though, that they are all more or less sulfur-rich, as Table 2 
indicates. The relative amounts of the constituents 
are presented in two ways: as the fractional amount x of As in material 
As_xS_{1-x}, and as the amount y of S in materials of formula As_2S_y. Thus for 
pure glass, x should equal 0.40 and y should equal 3. The uncertainty 
in the values of x is around ±0.03, while the possible error in y is quite 
large, ranging from 0.5 to 1.7. There appear to be significant differences 
between the compositions of the films and those of the corresponding sources, 
as well as differences in the films themselves. To see whether a change in 
target composition as the material is used up might play some role in the 
difference in sputtering results, we made an electron microprobe traverse 
across a freshly broken face of this sample. A similar traverse was made 
across a face of glass used as the evaporation source. Both samples showed 
considerably less sulfur in the interior than at the surface. The content 
seemed to vary smoothly through the samples. We have not attempted quantita- 
tive analysis of this data because previous attempts to perform such calculations 
for microprobe measurements on evaporated films did not appear to yield 
reliable results. Qualitatively, we can say that the sputtering source 
is slightly more sulfur-rich near the surface than the glass raw material 
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Table 2. Composition of Bulk and Film Samples of Arsenic 
Trisulfide Glass as Indicated by ESCA. 



As S, 
x 1-x 

As 2 S y 

Sample 

X 

y 

Glass, evaporated source 

.338 

3.9 

Glass, sputter-etched 

10 minutes to remove carbon 

.350 

3.7 

Evaporated film 

.280 

5.1 

Unused piece of a sputtering target 

.204 

7.8 

Sputtered film, early 

.226 

6.8 

Sputtered film, late 

.330 

4.1 
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for evaporation, while near the center of the samples tested the evaporation 
source was more sulfur-rich. We might speculate that the molten evaporation 
source has a composition close to variations in composition in evapor- 

ated films then resulting from decomposition during evaporation. Changes 
in sputtered films, on the other hand, might reflect changes in the compo- 
sition of the exposed surfaces of the target. Clearly, though, considerable 
additional work is necessary if these ideas are to be verified or otherwise. 
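
The two composition measures used in Table 2 are linked: writing the same
material as As_x S_(1-x) and as As_2 S_y gives y = 2(1 - x)/x. The short
Python sketch below applies that relation to the tabulated x values. The
function name is illustrative only, and the use of a 0.03 downward shift in x
to gauge the error in y is an assumption about how the quoted error range
could arise, not a statement of the procedure actually used.

```python
def sulfur_per_two_arsenic(x):
    """Convert the As fraction x in As_x S_(1-x) to y in As_2 S_y."""
    return 2.0 * (1.0 - x) / x


# x values as listed in Table 2
TABLE_2_X = {
    "Glass, evaporated source": 0.338,
    "Glass, sputter-etched": 0.350,
    "Evaporated film": 0.280,
    "Unused piece of a sputtering target": 0.204,
    "Sputtered film, early": 0.226,
    "Sputtered film, late": 0.330,
}

for sample, x in TABLE_2_X.items():
    y = sulfur_per_two_arsenic(x)
    # Assumed error estimate: change in y if x is low by 0.03.
    dy = sulfur_per_two_arsenic(x - 0.03) - y
    print(f"{sample:38s} x = {x:.3f}  y = {y:.1f}  dy = {dy:.2f}")
```

Because y varies as 1/x, a fixed uncertainty in x produces a much larger
uncertainty in y for the most sulfur-rich samples.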

The analytical work described in this section was performed by
Julius Ogden, Doyle Kohler and Carl A. Alexander of the Battelle-Columbus
Physico-Chemical Systems Section.


14. DISCUSSION AND CONCLUSIONS 


In the first part of this program, we were able to show that
short-focal-length Luneburg lenses of good quality could be produced by
depositing arsenic trisulfide films on LiNbO$_3$ waveguides. One principal
difficulty was with variability of the properties, particularly the
refractive index, of evaporated films. This was one reason for trying
sputtering as a possible alternative deposition procedure. Sputtered films
to date, though, have shown a larger range of properties, and have been more
difficult to control, than have evaporated ones. We have suggested, on the
basis of limited chemical analyses, that inhomogeneity in the material
forming the sputtering target may be responsible. This suggestion is at
present very speculative, of course.

Improvements have been made in the remainder of the design and 
fabrication procedure, so if reliable film deposition procedures are 
developed, it should require primarily some process refinement to permit 
production of useful lenses on a regular basis. 

Reliability and reproducibility of the lens production process 
can only be defined, though, with respect to some particular design objective. 
It is a good general objective to aim, as we have done, at fabrication of 
large diameter, short focal length, diffraction-limited lenses, but it is 
more appropriate in a particular case to investigate how the deviations 
from the nominal design which are likely to occur affect the lens performance 
in its designed role. It is important to bear in mind that the point sources 
and plane waves of conventional design are idealizations which are seldom 
appropriate for contemporary integrated optics devices. 
Our present judgment is that the most reliable way of making
arsenic trisulfide thin-film lenses is thermal evaporation followed by
annealing in an inert atmosphere (25). This opinion is based on only a few
experiments, though, and considerable additional experimental work including
more chemical analysis would be desirable. Surprisingly little analysis
of arsenic trisulfide films has been done anywhere.

As far as improvements in the remainder of the design and fabrica- 
tion process go, the primary requirement is closer integration of the ray- 
tracing work into the design and fabrication process. Ray tracing has pro- 
vided valuable information, but it has not always been obtained at the 
most suitable times, mainly through failure to recognize how helpful it 
was going to be. Other design improvements are in the nature of refine- 
ments. We have mentioned difficulties in some lenses with focus of peripheral 
rays, passing through the thinnest part of the lens. These problems can 
be alleviated by designing for still larger lens diameters and using only 
the central portion of the lens. If space on the substrate becomes a 
problem, rectangular- or lenticular-outline lenses can be used. Tolerance 
requirements for these large lenses remain to be investigated, though. 

In testing of the lenses, the only improvement that may be needed 
is in the profilometry of the lens shape. In situations where close 
tolerances must be maintained, uncertainty concerning curvature of the 
substrate below the lens makes sufficiently accurate measurement difficult. 
While there are a number of things that can be done, there is no easy way 
around this problem. 

If a good film deposition process is developed, these other design
and process refinements should permit the fabrication of arsenic trisulfide
lenses suitable for many beam-forming and signal-processing requirements.
APPENDIX A

THE NONLINEARITY PROBLEM


A major problem with most components available for analog optical
computation is that they are not linear devices. Thus, if a signal voltage
$V_s$ is applied to most common optical modulators, the output light
intensity will typically be

$$I = I_0 \sin^2(a V_s) \tag{1A}$$

where $I_0$ is the incident intensity and a is a constant characteristic of
the modulator. In most cases we would, of course, prefer that the output be

$$I = \text{const} \times V_s. \tag{2A}$$

There are a number of approaches to achieving this end. The most desirable
is to develop a linear modulator. We understand that there are some
promising developments along these lines, but we are unable to comment
further at this time. Other approaches are signal preconditioning, modified
detection schemes and operating in a binary mode.

Each of the three approaches to overcoming the intrinsic nonlinearity 
for the electrooptic grating modulator is discussed in this Appendix. Signal 
preconditioning is an analog method which is best applied to slowly-varying 
signals. In the case of the fully-parallel matrix-vector multiplier, in 
which the vector components are rapidly-changing data and the matrix elements 
represent a slowly-varying set of system equations, it would be natural to 
handle the matrix elements with the analog signal preconditioning technique. 

The modified detection scheme involves frequency-shifting one of 
two optical beams whose intensities are to be multiplied. The multiplication 
takes place on a square-law detector and takes advantage of its properties to 
extract the desired product. Matrix multiplication architectures using this 
technique have not yet been devised. 

Operation in a binary mode involves using one side of the herringbone
electrode structure as an (electrical) digital-to-(optical) analog
converter. The optical analog signal is then modulated (multiplied) by the
second half of the herringbone. This approach results in a reduction in the
size of the matrix which a given IOC can handle, since an N-bit (E)D/(O)A
converter requires N IOSLM elements. However, the ability to accept a direct
N-bit parallel input is a significant advantage.


Signal Preconditioning 

Assume a modulator M whose transmission T varies in a nonlinear way
with the applied voltage V, so that, for example,

$$T = \sin^2(aV). \tag{3A}$$

To set T to some desired value x, $0 \le x \le 1$, we must apply the voltage

$$V(x) = \frac{\sin^{-1}\sqrt{x}}{a}. \tag{4A}$$

This voltage can be generated using the circuit shown in Figure 1A. We use
an electrooptic modulator, M, to "model" the real modulator. Applying V
yields T. The detected signal, $\beta T$, is proportional to T with a readily
measurable proportionality constant, $\beta$. A comparator between $\beta T$
and $\beta x$ drives the circuit through feedback to

$$\beta T - \beta x = 0. \tag{5A}$$

When this condition obtains,

$$V = V(x) \tag{6A}$$

is available for use. The "decay" from the initial V, say $V = \pi/2a$ for
which T = 1, is exponential with the time constant limited only by comparator
speed. A time delay must be built in so that only the steady state V is
applied to the modulator we wish to control. It remains to be seen if such
an approach is sufficiently rapid to justify its use.
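
As a rough numerical illustration of the preconditioning idea, the Python
sketch below compares the closed-form voltage of Eq. (4A) with a crude
discrete-time model of the comparator loop, which lowers V from the T = 1
starting point until the modelled transmission matches x. The loop gain, time
step and number of steps are hypothetical values chosen only so the iteration
settles; they do not describe the actual circuit of Figure 1A.

```python
import math


def preconditioned_voltage(x, a):
    """Closed-form V(x) from Eq. (4A): the voltage giving sin^2(aV) = x."""
    return math.asin(math.sqrt(x)) / a


def feedback_voltage(x, a, gain=1.0, dt=0.01, steps=5000):
    """Euler model of a comparator loop driving T - x toward zero.

    gain, dt and steps are illustrative, not measured circuit parameters.
    """
    v = math.pi / (2.0 * a)           # initial voltage, for which T = 1
    for _ in range(steps):
        t = math.sin(a * v) ** 2      # transmission of the model modulator
        v -= gain * (t - x) * dt      # comparator output lowers V while T > x
    return v


if __name__ == "__main__":
    a = 1.0
    for x in (0.1, 0.25, 0.5, 0.9):
        print(x, preconditioned_voltage(x, a), feedback_voltage(x, a))
```

In this toy model the settling is exponential near the operating point, in
line with the description above; how fast a real loop settles depends on the
comparator, which the sketch does not attempt to model.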
An Optical Method for Linear Analog Multiplication


Assume again that the modulators generate the output given by
Eq. (1A). We will show that, by introducing an appropriate modulation signal,
we can extract an electrical signal proportional to $V_1 V_2$.
Consider the outputs of two modulators

$$I_1 = I_0 \sin^2(aV_1), \qquad I_2 = I_0 \sin^2(aV_2). \tag{7A}$$

If $aV_1, aV_2 \ll 1$, then the output intensities and the corresponding
amplitudes are

$$I_1 = (aV_1)^2 I_0, \quad A_1 = aV_1 A_0 e^{i(\omega+\delta)t}; \qquad
  I_2 = (aV_2)^2 I_0, \quad A_2 = aV_2 A_0 e^{i\omega t} \tag{8A}$$

where $\omega$ is the optical frequency, $\delta$ is an r.f. frequency shift
imposed on one of the beams, and $I_0 = |A_0|^2$.

If the two beams are now combined on a square-law detector, the
resulting signal is

$$S = \text{const} \times |A_1 + A_2|^2
    = \text{const} \times \left( |A_1|^2 + |A_2|^2 + 2|A_1||A_2| \cos\delta t \right)
    = \text{D.C. term} + \text{const} \times 2a^2 V_1 V_2 I_0 \cos\delta t. \tag{9A}$$

Therefore by detecting the A.C. term we get a signal which is linearly
proportional to $V_1 V_2$.

A method for accomplishing this is shown in Figure 2A. This can
be implemented either in bulk or integrated optical form.
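
The behaviour described by Eq. (9A) can be checked numerically. The Python
sketch below is a minimal model, assuming ideal sin-type modulators and an
ideal square-law detector: it forms the two fields, squares their sum, and
projects the result onto cos(δt) to recover the beat amplitude. The function
name and default parameter values are illustrative. The common optical
carrier is omitted because it cancels in the squared magnitude.

```python
import cmath
import math


def beat_amplitude(v1, v2, a=1.0, a0=1.0, delta=2.0 * math.pi, samples=1000):
    """Amplitude of the cos(delta * t) term at a square-law detector.

    The fields use the full sin(aV) modulator response, so the result is
    only approximately proportional to v1 * v2 (small-signal limit).
    """
    period = 2.0 * math.pi / delta
    acc = 0.0
    for k in range(samples):
        t = period * k / samples
        e1 = a0 * math.sin(a * v1) * cmath.exp(1j * delta * t)  # shifted beam
        e2 = a0 * math.sin(a * v2)                              # unshifted beam
        s = abs(e1 + e2) ** 2                                   # square-law detection
        acc += s * math.cos(delta * t)                          # project onto cos(delta t)
    return 2.0 * acc / samples


if __name__ == "__main__":
    base = beat_amplitude(0.01, 0.02)
    print(base, beat_amplitude(0.01, 0.04) / base)  # ratio close to 2
```

Doubling either drive voltage roughly doubles the detected beat amplitude as
long as aV stays small, which is the linearity the method relies on.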
D/A Conversion


An optimum method for overcoming the nonlinearity problem when
input data are available as parallel binary words is to use an appropriate
D/A converter. We present here the first experimental results of such a
device, an (electrical) digital-to-(optical) analog converter,* and show how
it can be used in an integrated-optic multiplier.

The D/A converter is fabricated upon a planar, single-mode
Ti-indiffused LiNbO$_3$ waveguide. The active element is an electrooptic
integrated optic spatial light modulator (IOSLM) which is simply an extended
interdigital electrode structure composed of a number of separately
addressable segments. The electrode segments are addressed in parallel with
the voltages representing the digital word to be converted.

In the configuration tested, it is essential that a digital "zero"
be represented by a zero voltage and that all digital "ones" be represented
by a voltage, V. As shown in Figure 3A, the voltages representing the
digital word are applied to the electrodes through voltage dividers. The
dividers are set so that the voltage V, when representing the most
significant bit, results in the diffraction of an optical power which we may
represent by $P_{max}$. The next divider is set so that the diffracted power
is $P_{max}/2$, the next to generate $P_{max}/4$, and so on. The total
optical power diffracted by the structure is therefore the optical analog
representation of the electrical digital input. This optical analog signal
may then be used as the input to an analog optical device such as a
multiplier, or a lens can be used to direct all of the diffracted light to a
photodetector, in which case the electrical analog signal is generated.
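
The binary weighting described above can be sketched in a few lines of
Python. The function below is an idealized model, assuming each "on" segment
diffracts exactly its assigned power and that the contributions simply add;
the function name and the string representation of the digital word are
illustrative choices.

```python
def diffracted_power(word, p_msb=1.0):
    """Total diffracted power for a binary word given as a string, MSB first.

    Segment k (k = 0 for the most significant bit) contributes p_msb / 2**k
    when its bit is "1" and nothing when its bit is "0".
    """
    return sum(p_msb / 2 ** k for k, bit in enumerate(word) if bit == "1")


if __name__ == "__main__":
    for value in (0, 1, 21, 32, 63):
        word = format(value, "06b")
        print(word, diffracted_power(word))
```

In this model the total power is proportional to the integer value of the
word, which is the straight-line response expected of a D/A converter.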

Figure 4A shows the results of a simple proof-of-principle experiment
which was set up by uniformly illuminating the IOSLM with a prism-coupled
guided plane wave. The diffracted light was collected by an external lens
and directed onto a photodetector. The voltage dividers were individually
set as described above, and the system was stepped manually through the
digital words 000000 to 111111 by the use of toggle switches. The figure
shows the analog voltage generated by the photodetector as a function of the
digital input word. As can be seen, the system functioned as expected. The
kink in the otherwise straight line is thought to be due to a slight
missetting in one of the voltage dividers.

* Developed under AFOSR support on Contract Number F49620-79-C-0044.

Figure 3A. The use of voltage dividers to bias the elements of an IOSLM
so that the entire structure performs D/A conversion.

The high-speed performance of the integrated optic D/A can be
estimated by assuming, for example, that a laser will be used which will
result in a diffracted power of 50 microwatts from the most significant bit.
In this case the maximum diffracted power, when all six bits are on, will be
98.44 microwatts and the contribution of the least significant bit will be
1.56 microwatts, a value which is 36 dB down from the maximum. It can be
shown that, for direct detection of a 100 microwatt signal at a 100 MHz
bandwidth, the signal-to-noise ratio (SNR) of an optical detector is 60 dB.
Therefore, the LSB can be detected with an excess SNR of 24 dB. This excess
can be retained to achieve a minimum error rate, or be used to increase the
number of bits, increase the operating rate, or decrease the optical power.
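
The figures in this estimate can be reproduced with a short calculation. The
Python sketch below assumes that the decibel values refer to the detected
electrical signal, so that a ratio of optical powers enters as 20 log10;
this convention is an assumption made here because it reproduces the quoted
36 dB, and the 60 dB detector SNR is simply taken from the text rather than
derived.

```python
import math

N_BITS = 6
P_MSB_UW = 50.0          # diffracted power of the most significant bit, microwatts
DETECTOR_SNR_DB = 60.0   # value quoted in the text for about 100 microwatts at 100 MHz

# Each successive bit contributes half the power of the previous one.
p_total_uw = sum(P_MSB_UW / 2 ** k for k in range(N_BITS))
p_lsb_uw = P_MSB_UW / 2 ** (N_BITS - 1)

# Assumed convention: electrical dB, i.e. 20 log10 of the optical power ratio.
lsb_db = 20.0 * math.log10(p_lsb_uw / p_total_uw)
excess_db = DETECTOR_SNR_DB + lsb_db

print(f"total = {p_total_uw:.2f} uW, LSB = {p_lsb_uw:.2f} uW")
print(f"LSB relative to maximum = {lsb_db:.1f} dB, excess SNR = {excess_db:.1f} dB")
```

Each additional bit halves the LSB power and so costs about 6 dB of the
excess under this convention, which gives a feel for how the 24 dB margin
could be spent.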

Figure 5A indicates how the integrated-optic D/A can be used as part 
of a herringbone structure to perform part of a vector-matrix multiplication. 
Each matrix element is represented by a 3-bit word. Each vector element is 
represented by an analog signal which has been linearized by the analog 
method discussed above. 

An IOC for matrix-vector or matrix-matrix multiplication by the
engagement algorithm is shown in Figure 6A. Note that the penalty paid for
using an N-bit D/A is an N-fold reduction in the dimension of the matrix
which a given processor can handle.
Figure 5A. A herringbone structure used to form the product of two
three-vectors, one represented by three 3-bit binary words, the other by
3 analog voltages.

Figure 6A. IOC for vector-vector or engagement matrix-vector multiplication
using the hybrid digital-analog herringbone multiplier.
In order to handle arbitrary sized matrices with fixed sized optical matrix
processors it is necessary to expand or contract the problem to fit the processor. 
Here we examine this preprocessing, show a quite general method, and apply it to the 
type of triple matrix product calculation needed for Kalman filtering. Emphasis 
will be placed on systolic type processors. 

The recent explosion of interest in optical matrix processors (refs. 1-6) need 
not be reviewed here except to note that even with spatial light modulators with one 
dimensional space-bandwidth products of 1000 or more, we may not be able to handle 
large matrices. Spatial dimensionality is used to allow representation of real or 
complex numbers, to achieve high numerical accuracy through binary representation, 
and to allow floating point calculations. As a result, we might find ourselves 
limited to working with relatively small matrices, say, 20 x 20. Call this processor 
dimension D. The problem we discuss here is how to match real problems to such a 
restricted processor. In all that follows we will illustrate with D=2 processors. 

The first step will be to expand the given matrix so that its dimensions are 
mD x nD. To do this we fill out the given matrix with zeros to the right and below. 
For D=2 and the given matrices 
we expand to

It is easy to show

It is well known (ref. 7) that

Let us see how we can best order these calculations. Figure 1 shows an optical
matrix processor and its supporting electronics. Clearly we need never do more than
a D x D matrix at any time. One memory must store $A_E$ and $B_E$. The partitioning
electronics then selects out of the memory the needed submatrices.

Let $A_E$ and $B_E$ be of dimension nD x nD. If we can afford $n^2$ parallel D x D
processors, we can use some memory-efficient approach such as the engagement approach
shown in figure 2. In many cases this will be impractical. The other extreme case
is that of only one D x D processor. In that case we order the $A_E$ and $B_E$
submatrices in such a way as to calculate one submatrix at a time of the product
matrix so all integration occurs on the detectors. In our example, we calculate
$A_{11}B_{11}$ first and then add to it on the same detectors $A_{12}B_{21}$.
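
The expand-and-partition procedure for the single-processor case can be
sketched in software. The Python code below is only an illustration under
stated assumptions: matrices are plain nested lists, `dxd_multiply` is a
hypothetical stand-in for one pass through the D x D optical processor, and
the `detector` array plays the role of the integrating detectors.

```python
def pad_to_multiple(m, d):
    """Fill out a matrix with zeros to the right and below (dimensions multiples of d)."""
    rows = -(-len(m) // d) * d
    cols = -(-len(m[0]) // d) * d
    return [[m[i][j] if i < len(m) and j < len(m[0]) else 0
             for j in range(cols)] for i in range(rows)]


def block(m, bi, bj, d):
    """Select the d x d submatrix with block indices (bi, bj)."""
    return [row[bj * d:(bj + 1) * d] for row in m[bi * d:(bi + 1) * d]]


def dxd_multiply(a, b):
    """Stand-in for one pass through the D x D processor."""
    d = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(d)) for j in range(d)]
            for i in range(d)]


def partitioned_product(a, b, d):
    """Multiply a and b one D x D block product at a time."""
    ae, be = pad_to_multiple(a, d), pad_to_multiple(b, d)
    n_i, n_k, n_j = len(ae) // d, len(be) // d, len(be[0]) // d
    out = [[0] * len(be[0]) for _ in ae]
    for bi in range(n_i):
        for bj in range(n_j):
            detector = [[0] * d for _ in range(d)]   # integrates partial products
            for bk in range(n_k):
                partial = dxd_multiply(block(ae, bi, bk, d), block(be, bk, bj, d))
                for i in range(d):
                    for j in range(d):
                        detector[i][j] += partial[i][j]
            for i in range(d):
                for j in range(d):
                    out[bi * d + i][bj * d + j] = detector[i][j]
    return out
```

For 3 x 3 inputs and D = 2, the result is a 4 x 4 matrix whose upper-left
3 x 3 corner is the ordinary product and whose added row and column are zero.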
One large and important type of matrix problem is the Kalman filter: a general
and powerful estimation technique widely used in many areas such as automatic
control (ref. 8). The Kalman filtering process is a recurring, interactive, ordered
sequence of matrix inversions, additions or subtractions, and multiplications. The
most difficult tasks are several triple matrix products of the form ABC. Let us now
explore efficient ways of doing $A_E B_E C_E$ products.

In the case in which we can afford $n^2$ parallel processors, we finish
calculating the 1,1 component of $A_E B_E$ just as we need it to multiply the 1,1
component of $C_E$ in the 1,1 multiplier, etc. Thus, except for time needed for
electronic conversions, reformatting, and feedback (see figure 1), the calculation
of $A_E B_E C_E$ takes only 3N-1 single D x D multiplier clock times to evaluate
rather than 2(2N-1) if $A_E B_E$ were calculated fully before we begin to calculate
$A_E B_E C_E$.

To accomplish the $A_E B_E C_E$ calculation with all integration and memory taking
place only at detectors we need at least n+1 D x D multipliers. The method is easy
to understand. First we calculate the 1,1 element of $A_E B_E$ on a single D x D
computer. Then we broadcast it to the n computers which, in parallel, multiply it
by the (1,1), (1,2), ..., (1,n) elements of $C_E$. Then we calculate the 1,2 element
of $A_E B_E$, multiply it in parallel with the (2,1), (2,2), ..., (2,n) elements of
$C_E$, accumulate the sums on the detectors, etc. Because the calculations are
likely to be systolic or engagement types, we can (as before) keep all parts of the
system busy at all times. That is, the 1,1 component of the 1,2 component of
$A_E B_E$ will be available only one clock time after the D,D component of the 1,1
component of $A_E B_E$. Of course the n parallel processors are ready for each
element from the single processor as it is calculated. To calculate all $n^2$
components of $A_E B_E$ takes $n^2 + n - 1$ clock times. Only n clock times later
the whole matrix $A_E B_E C_E$ is calculated, so a total of $n^2 + 2n - 1$ clock
cycles is needed.
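
The data flow of the n+1 multiplier scheme can be sketched as follows. This
Python code is a software analogy only: the matrices are given as n x n
nested lists of D x D blocks, `matmul` is a hypothetical stand-in for a
D x D multiplier, and the clock count is simply the expression quoted in the
text, not something the sketch derives or times.

```python
def matmul(a, b):
    """Stand-in for one D x D multiplier."""
    d = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(d)) for j in range(d)]
            for i in range(d)]


def add_into(acc, m):
    """Accumulate m into acc in place, as an integrating detector would."""
    for i, row in enumerate(m):
        for j, value in enumerate(row):
            acc[i][j] += value


def triple_product_blocks(A, B, C):
    """Form ABC from n x n arrays of D x D blocks without storing AB."""
    n, d = len(A), len(A[0][0])

    def zero():
        return [[0] * d for _ in range(d)]

    R = [[zero() for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for k in range(n):
            # Single processor: block (i, k) of AB, summed on its detector.
            p = zero()
            for j in range(n):
                add_into(p, matmul(A[i][j], B[j][k]))
            # Broadcast to n multipliers; multiplier j accumulates block (i, j) of ABC.
            for j in range(n):
                add_into(R[i][j], matmul(p, C[k][j]))
    return R


def clock_times(n):
    """Total clock count quoted in the text for this scheme."""
    return n * n + 2 * n - 1
```

Each block of AB is used as soon as it is formed and then discarded, so the
only storage for intermediate results is the accumulating arrays that stand
in for the detectors.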

These considerations show that an expanding-partitioning-interleaving approach
provides an efficient way to use D x D matrix multipliers to handle arbitrary sized
matrices. The illustration of the triple matrix products so critical to Kalman
filtering shows in some detail how the calculations can be done while using only the
D x D detector arrays for scratch pad operations (storage of intermediate results).
Figure Captions


Figure 1. An optical computer will contain more electronics than optics.
          This figure indicates perfectly some of the functions the electronics
          must service.

Figure 2. Submatrices can be ordered in the same way as individual components
          for engagement processing (a). The notation above for the particular
          case illustrated in the text can be further broken down in terms of
          individual matrix components (b).
APPENDIX C


NUMERICAL ACCURACY 

While optics will offer advantages over electronics in speed, size, 
power consumption, and problem size, it is known to suffer badly in accuracy 
comparisons. Two approaches are possible (so far as we know) to improve this 
situation. Both involve trading off some known advantage to gain back some 
accuracy. 

The first approach is to do "bit slicing". That is, each number will
be represented by many optical signals rather than just one. If we use 16
binary signals per number we can represent 16 bit numbers. Two independent
researchers have come up with proprietary solutions to multiplying such
numbers using optics with no better than 4 bit accuracy (good optical systems
have 6 or 7 bit accuracy). We cannot disclose those schemes now, but we
will be able to before the contract is over. This scheme buys accuracy by
lowered speed (if the bit slices occur sequentially) or increased complexity
(if the bit slices occur in parallel). These prices are unfortunate but
appear to be affordable because of the extremely large inherent speed and
complexity advantages of optics over electronics when no bit slicing is used.
The price is also comforting in the sense that we would be worried if nature
appeared to give us something for nothing. Thus we must figure out the
minimum required accuracy and design for that to achieve maximum speed or
minimum complexity.

The second approach is to formulate the problem in such a way that we can
achieve a decreased need for accuracy if we make more calculations. We
perform what we will call "approximation" and "reformulation" in sequence
many times. Suppose we want to find the $x_0$ satisfying $f(x_0) = 0$ and
close to x = 3. We evaluate f(3) and f'(3). We then find $\Delta x_1$ such
that

$$f(3) + f'(3)\,\Delta x_1 = 0.$$

We do not need to do this very accurately. This allows the first
approximation

$$x_0^{(1)} = 3 + \Delta x_1.$$

We now reformulate by going back to the mathematically exact expression for
f(x) to find $f(x_0^{(1)})$ and $f'(x_0^{(1)})$. We can then approximate again.

Arbitrary accuracy in the final x is possible if the approximation accuracy
is good enough to get closer each time. This approach (the example is
called the Newton-Raphson method) "starts over" with each cycle but starts
closer to the correct answer each time. In a control problem we can

• use optics to calculate the control vector u, 

• use the approximate u to "correct" the system, 

• measure and infer the resulting state vector x, and 


• use optics to calculate the control vector to correct the 
system given the new state vector. 
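
The approximation and reformulation cycle described above can be imitated
numerically. In the Python sketch below, the correction step is deliberately
rounded to a few bits of mantissa as a crude stand-in for a low-accuracy
optical calculation, while f and f' are re-evaluated exactly on each cycle.
The test function x^2 - 10, the 4-bit setting and the helper names are
assumptions made for illustration only.

```python
import math


def low_precision(value, bits=4):
    """Keep only `bits` bits of mantissa, mimicking a low-accuracy analog step."""
    mantissa, exponent = math.frexp(value)
    return math.ldexp(round(mantissa * 2 ** bits) / 2 ** bits, exponent)


def refine(f, fprime, x, cycles=30, bits=4):
    """Newton-Raphson cycles with a coarse step and an exact reformulation."""
    for _ in range(cycles):
        dx = low_precision(-f(x) / fprime(x), bits)   # "approximation"
        x = x + dx                                    # next cycle re-evaluates f exactly
    return x


if __name__ == "__main__":
    root = refine(lambda x: x * x - 10.0, lambda x: 2.0 * x, 3.0)
    print(root, math.sqrt(10.0))
```

Even with roughly 4-bit accuracy in each correction, the error shrinks on
every cycle, so the final accuracy is set by the exact evaluation of f
rather than by the coarse step.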

From these discussions it becomes clear that a complex analysis must 
be undertaken to optimize the algorithm-hardware combination for any part- 
icular task. What NASA will require is the full set of algorithm and hard- 
ware variations along with the rules for making the tradeoff. 
A DIFFERENT POLYNOMIAL EVALUATOR


If the purpose of the polynomial evaluator is to represent an ideal plant
in an active control system (rather than solving a polynomial), it may be
easier and numerically better to use a method other than Horner's rule.

In general the best fit to a function f(x) using polynomials of given
order is of the form

$$f_{m,n}(x) = \frac{a_m x^m + a_{m-1} x^{m-1} + \cdots + a_0}
                    {b_n x^n + b_{n-1} x^{n-1} + \cdots + b_0}.$$

A general approach to finding the a's and b's is Pade approximation (1).
Usually m = n ± 1. In this case we will simply want to evaluate
$f(x) \approx f_{m,n}(x)$ given x very rapidly. We have no interest in
finding its roots.

Likewise we may wish to integrate a differential equation of the form

$$dy/dx = f(x,y) \approx f_{m,n}(x,y).$$

Here we can use a high order Runge-Kutta method which requires evaluating
f(x,y) at a variety of specific arguments.

In these cases a product form of the polynomial provides better numerical
stability (2). Let us write

$$P_n(x) = \prod_{k=1}^{n} (x - r_k),$$

where the $r_k$'s are the roots of $P_n(x)$ which can be preevaluated. The
optical product evaluator is very simple. Conceptually, it looks like this.

A series of modulators $M_1, \ldots, M_n$ is driven by signals
$(x - r_1), \ldots, (x - r_n)$ and the product is detected at D. Time delays
between x inputs to the stages will be unnecessary for any NASA applications.
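
The product form can be compared with Horner's rule in a few lines. The
Python sketch below is a software analogy of the modulator chain, assuming a
monic polynomial with real, precomputed roots; the function names and the
example roots are illustrative.

```python
def product_form(x, roots, leading=1.0):
    """Evaluate leading * (x - r_1)(x - r_2)...(x - r_n), one factor per stage."""
    value = leading
    for r in roots:
        value *= (x - r)      # each stage multiplies by one (x - r_k) signal
    return value


def horner(coeffs, x):
    """Evaluate a polynomial from its coefficients, highest order first."""
    value = 0.0
    for c in coeffs:
        value = value * x + c
    return value


if __name__ == "__main__":
    roots = [1.0, 2.0, 3.0]                # (x - 1)(x - 2)(x - 3)
    coeffs = [1.0, -6.0, 11.0, -6.0]       # x^3 - 6x^2 + 11x - 6
    for x in (0.0, 2.5, 4.0):
        print(x, product_form(x, roots), horner(coeffs, x))
```

Both routines agree on this example. The product form needs only the roots
and one multiplication per stage, which is what makes it a natural match for
a chain of modulators.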